Most of this week’s work was collaborative, so today’s blog post is a collaboration too. Our main goals for Week #2 of our internship revolved around fine-tuning our database for the systematic review.
Last week, Rob’s blog post highlighted one way to reduce bias in a systematic review: outlining search criteria before beginning the review. By pre-defining search criteria, we avoid selecting a biased set of articles made up of those we’re most familiar with. Giancarlo has put this into practice to collect a set of articles (via Web of Science) most likely to represent true data syntheses, while Rob has focused on articles produced by members of the National Center for Ecological Analysis and Synthesis. The ongoing review of these articles has revealed data syntheses covering a wide range of ecological topics.
This week was another exercise in bias prevention. After an extensive and fruitful planning meeting with our DataONE mentor, Megan, to develop research questions, hypotheses, and predictions, we pre-defined the pieces of data we’ll extract from each article in the systematic review. By establishing the data extraction categories a priori, we know exactly what information we need from each article. We also try to establish all the possible “levels” within each category. Because we have been simultaneously “piloting” data extraction on a small but diverse set of data synthesis articles, the database has evolved as the breadth of data citation practices and data accessibility among these articles has become clearer.
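For readers who like to see things concretely, here is a minimal sketch of what pre-defined categories and levels can look like in practice. It is an illustration only, not our actual database: the category names (`data_citation`, `data_accessible`), their levels, and the `validate_record` helper are all hypothetical placeholders. The idea is simply that each article's record is checked against a list of categories and allowed levels that was fixed in advance.

```python
# Hypothetical extraction schema. The category names and levels below are
# placeholders for illustration, not the categories our team settled on.
EXTRACTION_SCHEMA = {
    "data_citation": {"cited", "not cited", "unclear"},
    "data_accessible": {"yes", "no", "unclear"},
}


def validate_record(record, schema=EXTRACTION_SCHEMA):
    """Return a list of problems found in one article's extracted record."""
    problems = []
    # Every pre-defined category must be present and use an allowed level.
    for category, allowed_levels in schema.items():
        if category not in record:
            problems.append(f"missing category: {category}")
        elif record[category] not in allowed_levels:
            problems.append(
                f"unexpected level for {category}: {record[category]!r}"
            )
    # Categories invented during extraction are flagged rather than accepted.
    for category in record:
        if category not in schema:
            problems.append(f"category not in schema: {category}")
    return problems
```

A flagged level is not necessarily a mistake by the person extracting the data. During piloting it can just as easily mean the schema is missing a level and needs to be revised before the full review begins.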
One piece of information we are particularly excited to extract from each article is called “data distance.” This indicates how much effort is required to access the data used for analysis in an original study. Since there is no established metric for data distance, the internship team had a series of (very friendly!) debates on how to define the different levels within this category. We look forward to reporting the results of the data distance measurements and the many other pieces of data we’ll extract over the next two weeks.