Week 8: DataONE MetaData parser Application

Hi all, this post follows up on my earlier posts for Project 4: Extending Libmagic for Identification of Science Resources. This week was very fruitful: we resolved most of the design and development issues for the final application. The application developed is in its final …

Week 6: Parser, Metadata Mapper Using Apache Tika

Hi all, this post follows up on my earlier posts for Project 4: Extending Libmagic for Identification of Science Resources. After resetting our goals for the rest of the project in the previous week, the goal is now to extract metadata from different file formats using Apache Tika. Since we want …

Week 5: Parser in Apache Tika for DataONE file Format.

Hi all, this post follows up on my earlier posts for Project 4: Extending Libmagic for Identification of Science Resources. This week, we shared our progress with other developers in a short demo. We showed how the file command and Apache Tika work for custom detection of …

Week 4: Creating Parser in Apache Tika for onedcx file format

Hi all, this post continues my earlier posts for Project 4: Extending Libmagic for Identification of Science Resources. Continuing from last week, we explored the Apache server's functionality for detecting custom MIME types for the DataONE file format. The httpd.conf file of the server is …

Week 3: Custom mimetypes/magic file for the DataONE file formats for identification using Apache Tika/Apache web server

Hi all, this post continues my earlier posts for Project 4: Extending Libmagic for Identification of Science Resources. Last week we created the magic file for the file command, and its repository admins accepted and committed the changes in the …

Week 2: Created, tested and Committed magic file with the Libmagic library.

Hi all, this is the second week of my internship, and below are the tasks completed this week. Adding patterns for the rest of the file formats to the dataone magic file: continuing the work from last week, we created additional patterns for …

Week 1 – What the file command and the libmagic library are, and how they work.

Hi all, my name is Pratik Shrivastava and I’m the intern working on Project 4: Extending Libmagic for Identification of Science Resources. The goal of this project is to extend the capabilities of the Linux file command (or its equivalents on OS X and Windows) to allow automatic identification of …

Welcome to the 2018 Summer Internship Open Notebooks

We are excited to begin work with our 2018 cohort of summer interns across a range of projects. Information about the projects can be found on our internship description page. Our interns will start recording their activities, experiences, and results in this space beginning May 2018.