My name is Seokki Lee and I’m the intern working on “DataONE Project 1: Sharing reproducible research through DataONE and Whole Tale”. The project focuses on using provenance tools and standards from DataONE so that the results of computational syntheses from empirical science can be stored in a fully reproducible and transparent manner. I will be posting a general weekly update of my activities here on the blog.
This first week I made a quick trip to meet my primary mentor, Bertram Ludaescher, at UIUC. We spent time discussing the project goals and clarifying the scope of activities. We also had an online call with Dave Vieglais from the University of Kansas and Bryce Mecum, a scientific software engineer at NCEAS, to get a general overview of DataONE and how provenance is used in it, including a short demonstration in R using the NSF Arctic Data Center. Along with that, I got a high-level overview of the Whole Tale project for reproducibility, which was recommended as a potential collaboration.
Since I have not had much experience with provenance for scientific research, I set my goal for the first week as studying how provenance is used in science by reading several references (in particular, I spent time finishing some papers on provenance in workflows and in R). In addition, I have been working on another fundamental requirement, understanding DataONE (and later Whole Tale), using the useful documents provided by Dave and Bryce, e.g., training documents for DataONE and R, documents on the DataONE packaging model, and ProvONE. I have been learning the structure and components of the systems, as well as how to use them, by working through the examples in the documents and doing some simple exercises.
As a next step, I will extend my use of the systems to the actual computational study (and also experiment more with the Arctic Data Center), focusing on the use of provenance for the packages in the study.
I hope you all have a nice long weekend.