Having concluded last week that it is imperative to explore approaches and algorithms other than Falcon-AO, I started working on AROMA (Association Rule Ontology Matching Approach). It is a hybrid, extensional and asymmetric matching approach that can match not only equivalence relations but also subsumption relations between entities, i.e. classes and properties, drawn from two OWL ontologies. AROMA relies on the following assumption: an entity A will be more specific than or equivalent to an entity B if the vocabulary (i.e. terms and also data) used to describe A, its descendants, and its instances tends to be included in that of B.
It is divided into three successive stages:
- Preprocessing stage: the data is prepared by constructing a set of relevant terms and/or data values for each class and property.
- Second stage: the association rules between entities are discovered. An association rule a → b represents a subsumption relation from the antecedent entity to the consequent one (see the assumption in the first paragraph). For example, the binary rule car → vehicle means: "The concept car is more specific than the concept vehicle".
- Postprocessing stage: the alignments produced in the previous stage are enhanced by deducing equivalence relations, suppressing cycles in the alignment graph, removing redundant correspondences, and selecting the best correspondence for each entity. This stage uses equality-based and string-similarity-based matchers.
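The three stages can be illustrated with a small Python sketch. This is a deliberately simplified toy, not AROMA's implementation: the term sets are made up, the function names are hypothetical, and a plain inclusion ratio with an arbitrary 0.8 threshold stands in for whatever rule-quality measure AROMA actually uses.

```python
# Stage 1 (assumed already done): each entity is described by a set of terms.
# These term sets are invented for illustration only.
source = {"car": {"wheel", "engine", "road"}}
target = {
    "vehicle": {"wheel", "engine", "road", "rail", "wing"},
    "automobile": {"wheel", "engine", "road"},
}


def term_inclusion(terms_a, terms_b):
    """Fraction of A's terms that also occur in B (0.0 when A has no terms)."""
    if not terms_a:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a)


def discover_rules(antecedents, consequents, threshold=0.8):
    """Stage 2: emit a rule a -> b when a's vocabulary is mostly included in b's."""
    rules = []
    for a, terms_a in antecedents.items():
        for b, terms_b in consequents.items():
            score = term_inclusion(terms_a, terms_b)
            if score >= threshold:
                rules.append((a, b, score))
    return rules


def deduce_equivalences(forward, backward):
    """Stage 3 (one step only): a -> b together with b -> a suggests a = b."""
    reverse_pairs = {(src, tgt) for tgt, src, _ in backward}
    return sorted((a, b) for a, b, _ in forward if (a, b) in reverse_pairs)


forward = discover_rules(source, target)    # rules from ontology 1 to ontology 2
backward = discover_rules(target, source)   # rules from ontology 2 to ontology 1
equivalences = deduce_equivalences(forward, backward)

print(forward)
print(equivalences)
```

With these toy term sets, car → vehicle holds in one direction only (a subsumption), while car and automobile imply each other and are therefore reported as equivalent. The other postprocessing steps (cycle suppression, redundancy removal, best-correspondence selection) are left out of the sketch.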
AROMA has participated in several editions of the OAEI (Ontology Alignment Evaluation Initiative) over the years, on tracks such as benchmark, anatomy and conference, with good results.
I applied this algorithm to some OWL ontologies downloaded from BioPortal and the ESIP semantic web portal, and the resulting alignments were interesting. The algorithm parses two ontologies at a time and produces a file in .rdf format containing alignment relations between the preprocessed terms (the sets built in the first stage). The ontologies used were SWEET, OBOE-SBC, SWO and biodiversity. Although the alignments contained the expected relations, the runs were mostly riddled with warnings and errors, some of them inexplicable. Only the biodiversity ontology seems to work fine with this algorithm; the others produced errors and exceptions during the preprocessing stage, such as java.lang.NullPointerException and NotFoundException.
Possible reasons are that the tested ontologies were broken, that there were problems with Jena (com.hp.hpl.jena.util.FileManager), or that AROMA is not good enough to produce correct alignments for these ontologies (though this last one seems a bit far-fetched).
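A quick way to probe the first hypothesis (broken ontology files) is to check each file independently of AROMA and Jena. The sketch below uses only the Python standard library and assumes the ontologies are serialized as RDF/XML; it checks XML well-formedness and counts owl:Class elements, which is much weaker than full OWL validation. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"

# Tiny made-up RDF/XML document used to exercise the check.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Car"/>
  <owl:Class rdf:about="http://example.org/onto#Vehicle"/>
</rdf:RDF>"""


def check_owl_text(xml_data):
    """Return (True, class_count) if the XML parses, else (False, error message)."""
    try:
        root = ET.fromstring(xml_data)
    except ET.ParseError as exc:
        return False, f"{type(exc).__name__}: {exc}"
    # Count owl:Class elements anywhere below the root element.
    classes = root.findall(f".//{{{OWL_NS}}}Class")
    return True, len(classes)


def check_owl_file(path):
    """Same check for a file on disk; a missing or unreadable file is reported too."""
    try:
        with open(path, "rb") as handle:
            data = handle.read()
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return check_owl_text(data)


print(check_owl_text(SAMPLE))
print(check_owl_text(SAMPLE[:-10]))  # closing tag cut off, so parsing fails
```

If a file passes this check but still fails in AROMA, a malformed file becomes a less likely explanation, and the Jena loading step or unresolved imports would be the next things to look at.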
Continuing with this line of thought, I intend to explore the algorithm further, possibly testing it on a few other ontologies, even ones unrelated to the earth and environment domain, to check its consistency. Since it accepts only the OWL format as input, the number of usable ontologies is limited. I will also test the algorithm on parts of the SWEET ontology, a thoughtful suggestion from my mentor.
Apart from this, I also played around with Falcon-AO and removed some errors from its source code. Falcon does boast a nice GUI, and I hope to get it working soon.