The SeaIce metadictionary is now complete, with a robust term/definition search engine and a commenting feature, and we are ready to begin populating the database with real terms. A number of interface design improvements could still be made; fortunately, there should be time toward the end of July to address some of these. As always, your feedback is appreciated (seaice.herokuapp.com). Two major features remain to be incorporated: first, enhancing term discovery by adding term relations and introducing a sound ontological structure to the metadictionary; second, crowd-sourcing, in order to evolve a stable, community-defined term set. The latter is the critical next step; in fact, term relations will themselves be crowd-sourced. As such, I'd like to talk a little about what I will implement for the prototype system.
The goal of crowd-sourcing is to rank terms in search results so that the terms and definitions the majority of the community agrees upon appear at the top. Along with its definition, examples, and ownership properties, each term in the metadictionary has an associated score. This integer value is calculated from votes cast by users and is used by the search ranking algorithm to determine a term's relevance to the query. A term's class (vernacular, canonical/stable, or deprecated) is also factored in, so that stable terms rank higher.
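To make the idea concrete, here is a minimal sketch of how a score and a class could combine into a ranking key. The names (`Vote`, `term_score`, `rank_key`) and the boost values are my own illustration, not SeaIce's actual API or constants:

```python
from dataclasses import dataclass

# Assumed boost factors: stable terms rank higher, deprecated lower.
CLASS_BOOST = {"canonical": 1.5, "vernacular": 1.0, "deprecated": 0.5}

@dataclass
class Vote:
    direction: int   # +1 for an up-vote, -1 for a down-vote
    reputation: int  # weight of the user casting the vote

def term_score(votes):
    """A term's integer score: sum of votes, each weighted by reputation."""
    return sum(v.direction * v.reputation for v in votes)

def rank_key(term_class, votes):
    """Higher key means more relevant; class boosts stable terms."""
    return term_score(votes) * CLASS_BOOST[term_class]
```

For example, a canonical term with one up-vote from a reputation-10 user and one down-vote from a reputation-2 user gets a score of 8 and a rank key of 12.0.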
Any user logged in to the website can vote a term up or down. We weight their vote by their reputation in the community; for the prototype system, this will simply be an integer value that we seed for initial users. In the future, reputation will be determined by the user's contributions. When a user proposes a term, it initially resides in the vernacular class. It is promoted to the canonical class based on community feedback (how this will work hasn't been completely hammered out), and terms with poor community support will be demoted to the deprecated class.
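Since the promotion policy isn't settled yet, here is just one plausible rule, sketched purely for illustration. The thresholds are invented placeholders, not a decided design:

```python
PROMOTE_AT = 25   # assumed score at which a vernacular term is promoted
DEMOTE_AT = -10   # assumed score below which a term is deprecated

def next_class(current_class, score):
    """One hypothetical lifecycle rule: promote well-supported vernacular
    terms to canonical; demote poorly supported terms to deprecated."""
    if current_class == "vernacular" and score >= PROMOTE_AT:
        return "canonical"
    if score <= DEMOTE_AT:
        return "deprecated"
    return current_class
```

A rule this simple is probably too twitchy on its own (a single high-reputation voter could flip a term's class), which is part of why the policy still needs specification.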
There are still some missing details in the crowd-sourcing model. Next week, I will work on an exact specification for this system. Two of the main questions I'll focus on are:
1. When will it be possible to vote on a term, and when will the owner have an opportunity to refine its definition? Should refining a definition reset the votes? (We could make this a continuous process and notify users when terms they are tracking have been modified by the owner.)
2. How do we quantify term stability in order to promote or demote terms? I suspect it will be necessary to look not only at a term's instantaneous score, but also at its rate of change over time.
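The second question can be sketched as well. Assuming we keep a history of score samples, stability could mean "high score and no longer changing quickly." The window size and thresholds below are arbitrary placeholders, not a specification:

```python
def rate_of_change(score_history, window=5):
    """Average per-sample change over the last `window` score samples."""
    recent = score_history[-window:]
    if len(recent) < 2:
        return 0.0
    return (recent[-1] - recent[0]) / (len(recent) - 1)

def is_stable(score_history, min_score=20, max_drift=1.0):
    """A term is stable if its score is high and has roughly flattened out."""
    return (score_history[-1] >= min_score
            and abs(rate_of_change(score_history)) <= max_drift)
```

Under this sketch, a term whose score has plateaued at 21 counts as stable, while one still climbing steeply (say, gaining five points per sample) does not, even if its current score is higher.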
I’ll blog my progress on this problem early next year. In the meantime, I’ll implement the mechanics of this system. Thanks for reading!