Exploration of Search Logs, Metadata Quality and Data Discovery: Week 4

My goal for week four was to do some exploratory data analysis (EDA), now that the data are all transformed into a system that makes them easy to query. I produced some preliminary results and figures describing the search and download events captured by the logs. I’ll go through a …

Exploration of Search Logs, Metadata Quality and Data Discovery: Week 3

My goals for week 3 were to collect download logs from a SOLR index, parse those logs into tokens, populate a database with the log information, and relate the download events to the search events by connecting them in time and by remote host address.  I was able to accomplish …

Exploration of Search Logs, Metadata Quality and Data Discovery: Week 2

For the second week of my project, my original goals were to collect download logs, parse the log events into tokens, and populate a database with the download information.  After our weekly internship call, my mentors and I decided to change things up a little bit. The purpose of building a …

Exploration of Search Logs, Metadata Quality and Data Discovery: Week 1

My name is Ed Flathers, and I’m the DataONE Summer Intern on the project, “Exploration of Search Logs, Metadata Quality and Data Discovery.”  This project is largely focused on data mining and analysis of the DataONE search logs, download logs, and quality reports; many of my products will be program …

Welcome to the 2017 Summer Internship Open Notebooks

We are excited to begin work with our 2017 cohort of summer interns across a range of projects.  Information about the projects can be found on our internship description page. Our interns will begin recording their activities, experiences, and results in this space in May 2017.