For the summer internship at DataONE, the general goal of Project 4 is to document citizen participation, data collection procedures, and analysis tasks undertaken by citizens through a literature review. Our team is particularly interested in data quality. To make the review and synthesis of prior research more systematic, we will use meta-analysis.
I therefore started collecting relevant papers using the keywords “Data Quality” and “Citizen Science” or “Volunteer Geographic Information” or “Volunteer monitor”, and arranged the list in an Excel file. I also ran the search in Scopus, and the papers were downloaded in collaboration with other team members. Scopus returned 77 papers for these terms; we have downloaded 35 of them so far, and the remaining 42 could not be downloaded. We have to figure out how to get access to the rest. It is also important to understand and agree on the attributes of interest for the meta-analysis. To that end, I got a sample Excel worksheet for meta-analysis from Jillian Dunic, who just completed a meta-analysis course with UMass Boston faculty member Jarrett Byrnes, and tried to set up a prototype worksheet.
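The keyword search above can be written as a single boolean query string. The sketch below is only an illustration: grouping the three alternative terms under the required term is my reading of the search, and the function name `build_query` is hypothetical rather than part of any search tool.

```python
def build_query(required, alternatives):
    """Combine one required phrase with a group of alternative phrases."""
    quoted_alternatives = " OR ".join(f'"{term}"' for term in alternatives)
    return f'"{required}" AND ({quoted_alternatives})'


query = build_query(
    "Data Quality",
    ["Citizen Science", "Volunteer Geographic Information", "Volunteer monitor"],
)
print(query)

# Download progress reported in this post.
listed = 77
downloaded = 35
remaining = listed - downloaded  # papers we still need access to
print(remaining)
```

Keeping the terms in a list makes it easy to add or drop a keyword later without retyping the whole query.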
In addition, I am working on extracting citation information from Google Scholar. Scopus and Web of Science (WOS) are well known for their quality and reputation, but their coverage is limited. Google Scholar should therefore be a great asset for making our literature review more complete. Importing citation information manually from Google Scholar is tedious, so I am looking for applications that can retrieve it automatically.
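Once citations come from more than one source, the lists will overlap, so they need to be merged without counting the same paper twice. The following is a minimal sketch of one way to do that, assuming each citation is a dictionary with a `title` field. The helper names and the sample records are placeholders for illustration, not real papers or an existing tool.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and spaces for matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_citations(*sources):
    """Merge citation lists, keeping the first record seen for each title."""
    merged = {}
    for records in sources:
        for record in records:
            key = normalize_title(record["title"])
            merged.setdefault(key, record)
    return list(merged.values())


# Placeholder records, not real papers.
scopus = [{"title": "Example Paper A", "source": "Scopus"}]
scholar = [
    {"title": "Example paper A.", "source": "Google Scholar"},
    {"title": "Example Paper B", "source": "Google Scholar"},
]

combined = merge_citations(scopus, scholar)
print(len(combined))
```

Matching on a normalized title is a simple heuristic. It would miss papers whose titles are recorded differently in each source, so the merged list would still need a manual check.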
Most of these applications seem to run on Linux, so I installed Linux on an unused computer and set up the packages needed to run them. I found two candidates for automatic citation extraction: Gscholar (https://github.com/venthur/gscholar) and Scholarscrap (https://bitbucket.org/fccoelho/scholarscrap). I was able to extract citation information with Gscholar, but only 20 citations, because it retrieves information from the first search results page only. I am checking whether I can tweak the Gscholar code, and I am also testing Scholarscrap. I will share the results with our team members as soon as I can.
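One possible way around the first-page limit is to request the later result pages as well. The sketch below only builds the page URLs and sends no requests. It assumes that the Google Scholar results page accepts a `start` offset in its URL, which I have not checked against the Gscholar code. The function name `page_urls` and the shortened query are hypothetical, and the total of 77 is simply the Scopus count reused as an example.

```python
from urllib.parse import urlencode

BASE_URL = "https://scholar.google.com/scholar"  # assumed search endpoint
PAGE_SIZE = 20  # citations returned from one results page


def page_urls(query, total_results):
    """Build one search URL per results page, stepping the offset by PAGE_SIZE."""
    urls = []
    for start in range(0, total_results, PAGE_SIZE):
        params = urlencode({"q": query, "start": start})
        urls.append(f"{BASE_URL}?{params}")
    return urls


urls = page_urls('"Data Quality" "Citizen Science"', 77)
for url in urls:
    print(url)
```

Any real use would also have to respect the site's terms of use and rate limits, which this sketch does not address.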