Documentation & Integration – DataONE Notebooks

I switched gears a little bit this week. Instead of focusing on developing new modules, I mainly worked on documentation and integration.

A good help document is very important for the IMIF package. Last week I used pydoc, a documentation tool that ships with Python, to generate one. However, pydoc is not very flexible, and the HTML help it generates does not look very “professional”. After talking with my mentor, we decided that each module should get its own web page and that the help document should be well organized, so we needed a more flexible documentation tool. I eventually found a powerful tool called Sphinx, which many Python packages use for their documentation. It uses a markup language called reStructuredText. After I prepared a document for each module in this language, Sphinx automatically generated well-organized, professional-looking help pages. I really love this tool. As one of the slogans on its website says, “Cheers for a great tool that actually makes programmers want to write documentation!”
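To give a feel for what reStructuredText looks like inside Python code, here is a minimal sketch of a docstring written with reStructuredText field lists, which Sphinx can render when it pulls docstrings from a module. The function temporal_mean is hypothetical and is not part of the IMIF package; it only illustrates the markup.

```python
def temporal_mean(values):
    """Return the mean of a sequence of values ordered by time.

    This function is a hypothetical example; it only illustrates
    the reStructuredText field-list style used in docstrings.

    :param values: numeric samples ordered by time
    :type values: list of float
    :returns: the arithmetic mean of ``values``
    :rtype: float
    :raises ValueError: if ``values`` is empty
    """
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)
```

The same docstring stays readable in plain Python (for example through help()), so the markup adds structure without hiding the text.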

My main task this week was integrating the modules I developed with the visualization modules developed by a DataONE postdoc and a previous summer intern. The integration has two parts: putting the modules together and using them together.

Putting them together is relatively easy. I merged their visualization modules into the IMIF package. To make the package easier for users to navigate, I organized the modules into four categories:

data: data input/output modules
processing: basic data processing modules
analysis: spatial temporal analysis/statistics modules
visualization: visual tools/plots

Using them together means linking the modules I developed with the visualization modules to create meaningful workflows. Most of the visualization modules take a matrix as input, and different visualization tools expect differently shaped input matrices. Therefore, to link my modules with them, I need to build a matrix from the output of my modules that can be fed into the visualization modules. I added a new module called “GetMatrixFromVariables” for this purpose, but it is still a work in progress. The challenge comes from two sides. On one hand, the outputs of my modules are diverse: some have time/lat/lon axes, some have only a time axis, and some have lat/lon axes without time. On the other hand, although the visualization modules require a matrix as input, they assume the matrix is already in their preferred format (e.g. axis order). I finished a test workflow for demo purposes this week, but more work is needed.
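The axis problem described above can be sketched with NumPy. This is not the actual GetMatrixFromVariables module; the function to_matrix, its parameters, and the default axis order ("time", "lat", "lon") are all assumptions made for illustration. The idea is to reorder whatever axes a variable has into one fixed order and to insert a length-1 axis for any axis it lacks, so that every output ends up with the same layout.

```python
import numpy as np


def to_matrix(data, axes, target_order=("time", "lat", "lon")):
    """Hypothetical helper: return `data` with its axes in `target_order`.

    `axes` names the axes of `data` in their current order, for example
    ("lon", "lat"). Axes missing from the variable become length-1 axes.
    """
    data = np.asarray(data)
    axes = tuple(axes)
    if data.ndim != len(axes):
        raise ValueError("one axis name is required per array dimension")
    if len(set(axes)) != len(axes) or not set(axes) <= set(target_order):
        raise ValueError("axis names must be unique and in target_order")

    # Reorder the axes that are present so they follow target_order.
    present = [name for name in target_order if name in axes]
    data = np.transpose(data, [axes.index(name) for name in present])

    # Insert a length-1 axis at the position of every missing name.
    for position, name in enumerate(target_order):
        if name not in axes:
            data = np.expand_dims(data, axis=position)
    return data


# A variable stored as (lon, lat) with no time axis.
grid = np.arange(6).reshape(3, 2)
cube = to_matrix(grid, ("lon", "lat"))
print(cube.shape)  # (1, 2, 3): time, lat, lon
```

A visualization module that expects a different order could be served by passing a different target_order, which keeps the format knowledge in one place instead of in every workflow.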
