Week 1 – Project Overview and Related Work

The goal of the Next Generation Data Environment is to provide a browser-based graphical interface for conversion of CSV files into RDF linked data format. Users will be able to define a schema for their conversion by dragging and dropping classes and properties from OWL ontologies directly onto a CSV data table loaded into the environment.
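To make the idea of a column-to-property schema concrete, here is a minimal sketch in Python of how a mapping from CSV columns to ontology properties could drive a conversion to RDF. It is an illustration only, not how CSV2RDF4LOD or the planned interface works: the function name, the example.org URIs, and the row-numbering scheme for subjects are all assumptions, and every value is emitted as a plain string literal in N-Triples syntax.

```python
import csv
import io

# Hypothetical base URI; each CSV row becomes one subject under it.
BASE_URI = "http://example.org/row/"


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, mapping, base_uri=BASE_URI):
    """Turn CSV text into a list of N-Triples lines.

    `mapping` plays the role of the user-defined schema: it maps a column
    name to the URI of the property that column should be converted to.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for index, row in enumerate(reader, start=1):
        subject = f"<{base_uri}{index}>"
        for column, predicate in mapping.items():
            value = row.get(column)
            if not value:
                continue  # skip missing or empty cells
            triples.append(f'{subject} <{predicate}> "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    # Assumed example data and property URIs, for illustration only.
    data = "site,ph\nLake A,7.2\nLake B,\n"
    schema = {
        "site": "http://example.org/vocab#siteName",
        "ph": "http://example.org/vocab#pH",
    }
    for line in csv_to_ntriples(data, schema):
        print(line)
```

In a drag-and-drop interface, the `schema` dictionary would presumably be built from the user's actions rather than written by hand; that part is outside the scope of this sketch.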

This week, I focused on familiarizing myself with the CSV2RDF4LOD software, which acts as the backend for the graphical interface I will be working on. I have also been investigating other existing software tools for semantic annotation and linked data conversion.

The CSV2RDF tool currently exists as an extension to RPI’s SemantEco system, so I have been reviewing the basic structures that are already in place. This review covers both the JavaScript graphical user interface elements and the documentation for the CSV2RDF4LOD converter tool. I also sat in on a conversation between two of my mentors and another student while they worked through conversions for one particular data set. In addition to demonstrating the current conversion process step by step, this experience highlighted the parts of the conversion that were difficult and that could be streamlined in a graphical interface version of the converter software.

My mentor also provided links to existing software for semantic annotation and conversion to RDF format. I have been testing out Anzo Express (http://www.cambridgesemantics.com/products/anzo-express); the RDF extension to Google Refine (http://refine.deri.ie/); and Morpho (http://knb.ecoinformatics.org/morphoportal.jsp). Reviewing tutorials and documentation for these programs has helped me understand what capabilities are already available, as well as the advantages and limitations of each tool.

I also developed a plan for the features and capabilities we want to include in the graphical user interface. Next week, I plan to draw up storyboards to guide the actual development of these parts of the interface environment.

