Week Eight – DataONE Notebooks

The primary task these days is to integrate new information into the old data set and regenerate the visualization. This week I got some new data from my mentor and integrated it into the previous data set. Because I processed the data in separate stages (integrating information, removing duplicates, and checking cross-references), I now have several versions of the data and code. I therefore need to be very careful to add information to the right version of each file and to reproduce the data derived from it. It also took me some time not only to integrate the data into the final versions of the files, but also to add the extra affiliation and role information to the visualization data, which has a different format. However, adding the extra information made the graph very complicated and hard to read, probably because only a few subjects have that information. So after discussing it with my mentor, I generated a new visualization using only the data points that have affiliation and role information available. Now the different visualizations show different views of the data.
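The integration and filtering steps described above can be sketched roughly as follows. This is only a minimal illustration, not my actual scripts: the field names (`name`, `list`, `affiliation`, `role`), the sample records, and the helper functions `merge_records` and `with_complete_info` are all made up for the example.

```python
def merge_records(old_rows, new_rows, key="name"):
    """Merge new_rows into old_rows, matching on `key`.

    Non-empty values from new_rows fill in or overwrite old values.
    Subjects that appear only in new_rows are added.
    """
    merged = {row[key]: dict(row) for row in old_rows}
    for row in new_rows:
        target = merged.setdefault(row[key], {})
        for field, value in row.items():
            if value not in (None, ""):
                target[field] = value
    return list(merged.values())


def with_complete_info(rows, required=("affiliation", "role")):
    """Keep only rows where every required field is present and non-empty."""
    return [row for row in rows if all(row.get(field) for field in required)]


# Hypothetical sample data, for illustration only.
old = [
    {"name": "Alice", "list": "list-a"},
    {"name": "Bob", "list": "list-b"},
]
new = [
    {"name": "Alice", "affiliation": "Univ X", "role": "researcher"},
    {"name": "Carol", "affiliation": "Lab Y", "role": ""},
]

merged = merge_records(old, new)      # full data set, one row per subject
subset = with_complete_info(merged)   # input for the second visualization
```

Here `merged` would feed the full graph, while `subset` keeps only the subjects whose affiliation and role are both known, which is the idea behind the second, less crowded visualization.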

Because this is the second-to-last week, and the visualization will be presented to the public, I added subject IDs to the data files and replaced subjects' names with IDs in the graphs. The graphs now look nice and clean, showing only the SNS and mailing list names and the participants' IDs. Now I need to re-check everything, including the code, data, and visualization, and write comments and instructions for each step needed to generate the visualization. This internship is so interesting; I learned many things from the challenges I met and overcame. I can hardly believe this is the eighth week.
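For the record, replacing names with subject IDs can be sketched like this. It is a simplified example under assumptions: the `S001`-style ID format, the sample edges, and the helpers `assign_ids` and `anonymize_edges` are hypothetical and only show the idea.

```python
def assign_ids(names, prefix="S"):
    """Give each distinct name a stable ID in order of first appearance."""
    ids = {}
    for name in names:
        if name not in ids:
            ids[name] = f"{prefix}{len(ids) + 1:03d}"
    return ids


def anonymize_edges(edges, ids):
    """Replace the participant name in each (participant, channel) edge with its ID."""
    return [(ids[person], channel) for person, channel in edges]


# Hypothetical graph edges: (participant name, SNS or mailing list name).
edges = [
    ("Alice", "mailing-list-a"),
    ("Bob", "sns-b"),
    ("Alice", "sns-b"),
]

ids = assign_ids(person for person, _ in edges)
anon = anonymize_edges(edges, ids)
```

The name-to-ID table (`ids`) has to be kept private and stored with the data files, so the public graphs can still be traced back to the right records when needed.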
