In our eighth week, we worked with project mentors to refine our systematic review database and analysis. The main question for our internship was whether a selection of data-aggregation studies could be repeated using repositories available in DataONE. During the preceding weeks, we extracted dozens of pieces of data from each manuscript included in the systematic review. This week, we directly compared the data sources aggregated in our selection of 80 papers against the repositories on DataONE and those in the well-known list of repositories at https://www.re3data.org/.
During our weekly check-in, Giancarlo presented a series of statistical models with accompanying figures (one is shown below). We found that certain types of data tended to be more accessible than others. For example, data stored in databases were generally more accessible than spatial data. We also had an interesting discussion about the best ways to present our results visually in the manuscript. A fun part of the internship has been collaborating on everything from project design to writing the manuscript.
We also continued work on the best-practices documents. Our main product was a series of best practices for data citation in data-aggregation research, and our mentor Megan is providing edits to this document. We also created two figures to accompany the best practices. The first (shown below) summarizes the best practices in one concise figure. The second graphically documents the steps we took to conduct the systematic review.