Last week, the preservation and metadata work group met in Chicago to flesh out an exact specification for our online, crowd-sourced metadictionary. We discussed in detail the features that we’ll implement for the prototype, as well as future add-ons that will follow the internship. I began coding last week and progress is steady.
In the first week, I spent a lot of time thinking about a precise metadata ontology. This exercise was valuable in that it gave me a chance to think about how a metadictionary could be used. I also had the chance to think about the computational space we’d be creating and how it could benefit research scientists. However, our meeting in Chicago revealed that my original model was far too restrictive. My model assumed the existence of a predefined set of relations. Relationships between terms will certainly be an important element of our metadictionary, but we don’t want to define them ahead of time; relations should evolve in the same way that terms evolve in the social ecosystem.
And now a brief overview of the system we hope to build. We’re creating an online service that allows researchers to look up the correct terms (and definitions for those terms) to tag their data sets. Of course, this is not the only application; any form of digital media should be covered by our metadictionary. Term discovery is facilitated not only by text searches on the definition, but by studying the relationships between terms. The core feature of this dictionary is that it’s entirely crowd-sourced. Users can contribute to the metadictionary in the following ways:
- propose a new term that the user believes is not covered in the dictionary,
- comment on and discuss other proposed terms,
- vote on others’ terms (up or down), and
- propose a relationship between a pair of terms to improve discovery (is-a, qualified-by, etc.).
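To make the four kinds of contribution concrete, here is a minimal in-memory sketch in Python. It is not the SeaIce code or its schema: the names `Term`, `MetaDictionary`, `propose_term`, `comment`, `vote`, `relate` and `related` are all hypothetical, chosen only for illustration. The one design point it tries to capture is that a relation label is stored as free-form text, so new kinds of relations can appear without being defined ahead of time.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    definition: str
    comments: list = field(default_factory=list)  # (user, text) pairs
    votes: dict = field(default_factory=dict)     # user -> +1 or -1


class MetaDictionary:
    """Hypothetical in-memory model; a real service would sit on a database."""

    def __init__(self):
        self.terms = {}         # term name -> Term
        self.relations = set()  # (subject, label, object) triples

    def propose_term(self, name, definition):
        if name in self.terms:
            raise ValueError(f"term already exists: {name}")
        self.terms[name] = Term(name, definition)
        return self.terms[name]

    def comment(self, name, user, text):
        self.terms[name].comments.append((user, text))

    def vote(self, name, user, up):
        # One vote per user per term; voting again overwrites the earlier vote.
        self.terms[name].votes[user] = 1 if up else -1

    def relate(self, subject, label, obj):
        # The label is free-form text, so no fixed set of relations is assumed.
        for name in (subject, obj):
            if name not in self.terms:
                raise KeyError(name)
        self.relations.add((subject, label, obj))

    def related(self, name):
        # Return every relation that mentions the term, in either position.
        return sorted(r for r in self.relations if name in (r[0], r[2]))


# Example usage with made-up terms and users.
d = MetaDictionary()
d.propose_term("temperature", "Degree of hotness or coldness of a sample.")
d.propose_term("sea surface temperature", "Water temperature near the ocean surface.")
d.comment("temperature", "alice", "Should the definition mention units?")
d.vote("temperature", "alice", up=True)
d.relate("sea surface temperature", "is-a", "temperature")
print(d.related("temperature"))
```

Keeping relations as plain triples means discovery can start from any term and follow whatever labels the community has proposed.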
The goal is to evolve a set of metadata terms that are stable and upon which the majority of the user community agrees. We’re still working through the exact mechanics of how terms flow from the vernacular (community debate) to the canon (stable term set), but we’ve agreed that to start, term voting will be reputation-based. (See link below for more details.)
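Since the mechanics are still open, the following is only one plausible reading of "reputation-based" voting, sketched in Python: each up or down vote is weighted by the voter's reputation. The function name `weighted_score`, the `default_reputation` parameter, and the idea of a plain weighted sum are assumptions made for illustration, not a decision the group has made.

```python
def weighted_score(votes, reputation, default_reputation=1):
    """Sum the votes on one term, weighting each by the voter's reputation.

    votes:      dict mapping user -> +1 (up) or -1 (down)
    reputation: dict mapping user -> non-negative weight
    Users without a recorded reputation get default_reputation (an assumption).
    """
    return sum(
        direction * reputation.get(user, default_reputation)
        for user, direction in votes.items()
    )


# Made-up users and numbers, purely for illustration.
votes = {"alice": +1, "bob": -1, "carol": +1}
reputation = {"alice": 5, "bob": 2}
print(weighted_score(votes, reputation))  # 5 - 2 + 1 = 4
```

Under a scheme like this, a term might move from the vernacular to the canon once its score stays above some threshold, though where that threshold sits is exactly the kind of detail still being worked out.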
We’re calling our metadictionary SeaIce. (It’s a little hard to explain why.) Much of the core functionality has already been implemented. I’ve completed the database components, the essential system API, and basic usability on the front-end. A major challenge of this project for me is that I don’t have formal web-development experience and have had to dive into tools I’ve never used before. Luckily, with the help of some folks in our working group, things are moving steadily and smoothly. Early next week, I’ll be able to deploy the first development stage online … I’ll be sure to post a link to it then.
For now, you can take a gander at our source code and give the HTTP server a run: http://github.com/cjpatton/seaice. Thanks for reading!