{"id":1138,"date":"2013-05-22T22:26:12","date_gmt":"2013-05-22T22:26:12","guid":{"rendered":"https:\/\/notebooks2.dataone.org\/?p=1138"},"modified":"2013-05-23T18:46:04","modified_gmt":"2013-05-23T18:46:04","slug":"on-tagging","status":"publish","type":"post","link":"https:\/\/notebooks.dataone.org\/data-science\/on-tagging\/","title":{"rendered":"On Tagging"},"content":{"rendered":"

For the purposes of setting up this open notebook, I have an assigned category (data science).<\/p>\n

The category gathers all of my blog entries into one collection.<\/p>\n

To organize information further, WordPress allows the use of tags. Tags are separated with commas.<\/p>\n

Unfortunately, I do not have a controlled vocabulary from which to work.<\/p>\n

As a student of information science, I am naturally interested in controlled vocabularies. In fact, I am very interested in automated metadata annotation, as is DataONE in general.<\/p>\n

A project that has caught my attention and might be useful is HIVE: <https:\/\/www.nescent.org\/sites\/hive\/Main_Page><\/p>\n

You may notice nescent in the URL: NESCent is the National Evolutionary Synthesis Center. The HIVE project is also funded by the Institute of Museum and Library Services (IMLS) and has some involvement with the UNC iSchool.<\/p>\n

Helping Interdisciplinary Vocabulary Engineering (HIVE) is an\u00a0IMLS funded project<\/a>\u00a0involving the\u00a0Metadata Research Center (MRC)<\/a>\u00a0at the School of Information and Library Science, University of North Carolina at Chapel Hill, and the\u00a0National Evolutionary Synthesis Center (NESCent)<\/a>\u00a0in Durham, North Carolina. The two and a half year project is demonstrating the HIVE model for dynamically integrating multiple controlled vocabularies. A recent extension includes\u00a0HIVE-ES (Espa\u00f1a)<\/a>\u00a0HIVE in Spanish.<\/p>\n

HIVE is an automatic metadata generation approach that dynamically integrates discipline-specific controlled vocabularies encoded with the\u00a0Simple Knowledge Organisation System (SKOS)<\/a>, a World Wide Web Consortium (W3C) standard. HIVE will assist content creators and information professionals with subject cataloging and will provide a solution to the traditional controlled vocabulary problems of cost, interoperability, and usability.<\/p><\/blockquote>\n

What I like about HIVE is that it lets me search across vocabularies, including the controlled vocabulary originally established by the NBII Program and Cambridge Scientific Abstracts and now called the USGS Biocomplexity Thesaurus.<\/p>\n

What’s neat is that I can copy and paste the text of this blog post into a Word document, run it through HIVE using a controlled vocabulary of my choosing, and get back suggested keywords, including “data,” “metadata,” and “vocabulary.” I’ll use “metadata” and “vocabulary” on this post, starting to build my own controlled vocabulary.<\/p>\n
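
<p>To make the idea of vocabulary-based keyword suggestion concrete, here is a minimal sketch in Python. It is not how HIVE or KEA works internally; it only counts how often labels from a small vocabulary appear in a text and returns the preferred terms that occur most often. The vocabulary below, the function name suggest_terms, and the top_n parameter are all invented for illustration and are not drawn from AGROVOC, the USGS Biocomplexity Thesaurus, or the HIVE code.<\/p>\n

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary: each label maps to a preferred term.
# Invented for illustration; not taken from any real thesaurus.
VOCABULARY = {
    'metadata': 'metadata',
    'controlled vocabulary': 'vocabulary',
    'controlled vocabularies': 'vocabulary',
    'vocabulary': 'vocabulary',
    'vocabularies': 'vocabulary',
    'tag': 'tagging',
    'tags': 'tagging',
    'hive': 'hives',
}


def suggest_terms(text, vocabulary, top_n=3):
    # Lowercase the text and split it into alphabetic word tokens.
    tokens = re.findall('[a-z]+', text.lower())
    counts = Counter()
    # Look up every one-word and two-word window in the vocabulary.
    # A two-word label and the single word inside it are both counted.
    for size in (1, 2):
        for i in range(len(tokens) - size + 1):
            phrase = ' '.join(tokens[i:i + size])
            if phrase in vocabulary:
                counts[vocabulary[phrase]] += 1
    # Return the preferred terms, most frequent first.
    return [term for term, _ in counts.most_common(top_n)]


sample = 'HIVE integrates controlled vocabularies. Tags and metadata help.'
print(suggest_terms(sample, VOCABULARY))
```

<p>Because the lookup is purely string-based, a label such as hive matches no matter which sense the writer intended. A real tool would need weighting or disambiguation on top of this kind of matching.<\/p>\n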

Figure 1. Selecting available thesauri, and choosing text input (in this case, the URL to this blog, albeit prior to adding results of this discussion).<\/strong><\/p>\n

Figure 2. Output from the Keyphrase Extraction Algorithm<\/a> (KEA) tool.<\/strong><\/p>\n

Try it out at\u00a0http:\/\/hive.nescent.org\/indexing.html<\/a>.<\/p>\n

It would be nice if this were all automated, but I’ll take what I can get.<\/p>\n

There is a bit of humor in seeing the AGROVOC agricultural thesaurus<\/a> (Agricultural Vocabulary) suggest that I’m talking about beehive management, or the USGS Biocomplexity Thesaurus<\/a> suggest that I’m interested in electric generators (as opposed to keyphrase generators). However, this blog entry is perhaps not the best text for the KEA algorithm to process.<\/p>\n

For my master’s thesis, I am optimistic about incorporating the HIVE tool into my research on ontologically derived metadata annotation.<\/p>\n

 <\/p>\n","protected":false},"excerpt":{"rendered":"

For the purposes of setting up this open notebook, I have an assigned category (data science). The category collects all of my unique blog entries into one collection: To organize information further, wordpress allows for the use of tags. \u00a0Tags are separated with commas. Unfortunately, I do not have a Continue reading On Tagging<\/span>→<\/span><\/a><\/p>\n","protected":false},"author":35,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12],"tags":[140,21,56,139],"_links":{"self":[{"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/posts\/1138"}],"collection":[{"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/users\/35"}],"replies":[{"embeddable":true,"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/comments?post=1138"}],"version-history":[{"count":6,"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/posts\/1138\/revisions"}],"predecessor-version":[{"id":1140,"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/posts\/1138\/revisions\/1140"}],"wp:attachment":[{"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/media?parent=1138"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/categories?post=1138"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/notebooks.dataone.org\/wp-json\/wp\/v2\/tags?post=1138"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}