Final Results

This week I finished the subtopic matching software and, after some testing, ran it. My goal was to answer one question: does coverage decrease significantly if you remove “topically” unrelated documents from the corpus? Surprisingly, I found that it does. While the SWEET ontologies are small, and attempt to …

SWEET Ontologies and Coverage

This week I finished the coverage analyzer tool and have run tests on over 200 popular ontologies (i.e., the SWEET ontologies)! Since the coverage tool is the main part of the project and I got this data over a week ahead of schedule, I’m quite pleased. The coverage tool …

Ontology Generation and Coverage

This week I finished the code and test cases for the automatic ontology generation I started last week, and began work on the coverage algorithm. The now-finished ontology generation comes with a readme and the ability to add individual words or merge two existing ontologies. In this way, it should …

Part of Speech Tagger, ontology generation

This week my main goal was to get a Part of Speech (PoS) tagger up and running. After some searching and testing I decided to use the Natural Language Toolkit. While it has to be installed (as opposed to running in a jar or Python egg), it runs quickly …

Gathering dataset

This week my focus was on meeting with my mentors, understanding my specific project requirements, and gathering my datasets. My initial work was to draw up a week-by-week plan for building a meaningful, generalizable ontology coverage tool for OWL ontologies. I spent some time looking over existing scripts and testing them …