My name is Elizabeth Olson, and I am the DataONE Summer Intern on the project “Improving DataONE’s Search Capabilities Through Controlled Vocabularies.” This project focuses on improving researchers’ search capabilities within the DataONE data repository. To improve the recall and precision of searches for the data objects available through the DataONE site, I will be using Protégé 5 to build on the existing “Ecosystem Ontology” (ECSO). Each Friday I will post my progress here, and you can also check out my GitHub repository, where I will upload the ontologies as they are developed.
This week I traveled to Santa Barbara to meet with the NCEAS team and my mentors, Mark Schildhauer and Julien Brun. During my visit I discussed the project goals with my mentors, and we developed a timeline for the coming weeks. Since this project is funded through the Arctic Data Center (ADC), we will focus on creating ontologies to improve the ADC search capabilities. Early this week I spent time becoming familiar with the existing data products in the ADC archives and identifying potential areas for vocabulary improvement. I also completed several tutorials on Protégé 5 and GitHub. During our discussions this week we decided to begin the project by focusing specifically on carbon-cycling vocabularies. To establish these ADC carbon-cycling vocabularies, I began querying the DataONE framework via the R dataone package to identify gaps in data discoverability. Additionally, I created a GitHub branch of the existing ECSO ontology, where I will edit the current vocabularies and add new ones.
As a team we covered a lot of ground this week and are off to a great start, and I learned a lot! Next week, I will work on conceptualizing the carbon-cycling vocabulary based on the DataONE database query results.