Week 4: ECSO knowledge representation and carbon cycling incorporation

This week we had two goals: (1) to further define our knowledge representation structure by going through the ECSO ontology and determining which custom annotations can be systematically replaced with standard SKOS elements, and (2) to improve the ontology's thematic content related to carbon cycling.

For the first task, I went through the classes systematically to establish the ‘design pattern’ for the ontology. For example, does each term have an rdfs:label, a skos:prefLabel, or both? Only a couple of SKOS labels were used in the ontology, while rdfs:labels were abundant. As I mentioned last week, 52 terms had no rdfs:label, so I used the IRI link to the ontology label to add an rdfs:label to each of those 52 terms.

While these 52 terms lack all other annotation information, the remaining terms in the ontology vary in their annotations. For example, the class ‘concentration of’ has the annotations rdfs:label, id, definition, hasOBONamespace, hasDbXref, definition_Source, has_Exact_Synonym, and inSubset, whereas the class ‘Count’ has rdfs:label, definition, definition_Contributor, definition_Source, and has_Related_Synonym. Both have rdfs:label, definition, and definition_Source; the differences between them are primarily due to their origin. ‘concentration of’ is imported from PATO (Phenotypic Quality Ontology), whereas ‘Count’ is a custom ECSO term. PATO is listed on the OBO Foundry website and has an OBO namespace; ECSO does not, so ‘Count’ does not carry this annotation.

Within ECSO, ‘Count’ is defined as ‘the total number counted’. PATO has a term ‘amount’ with has_Exact_Synonym ‘count’, but it is defined as ‘the number of entities of this type that are part of the whole organism’, which does not serve DataONE’s semantic needs. The preceding example demonstrates how I evaluated each term. In this case ‘Count’ had no id in its annotations, and this is true for all the native ECSO terms. Adding ECSO ids to each class is one way we can improve knowledge representation within ECSO.
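
To make the comparison of the two classes concrete, here is a minimal Python sketch. It treats each class's annotations as a plain set of property names copied from the lists above; it does not read the actual ECSO file, and the variable names are only illustrative.

```python
# Annotation properties as listed in this post for two ECSO classes.
# These sets are typed in by hand; nothing is loaded from the ontology itself.
concentration_of = {
    "rdfs:label", "id", "definition", "hasOBONamespace", "hasDbXref",
    "definition_Source", "has_Exact_Synonym", "inSubset",
}
count = {
    "rdfs:label", "definition", "definition_Contributor",
    "definition_Source", "has_Related_Synonym",
}

# Annotations both classes carry, and those unique to each one.
shared = concentration_of & count
only_concentration_of = concentration_of - count
only_count = count - concentration_of

print("shared:", sorted(shared))
print("only 'concentration of':", sorted(only_concentration_of))
print("only 'Count':", sorted(only_count))
```

The set differences make the origin effect visible: the OBO-specific properties appear only on the imported PATO term.
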
Following this analysis of the current state of ECSO's knowledge representation, I have determined that the following elements should be present in all class annotations: rdfs:label, id, definition, definition_Source, and definition_Contributor or created_by (if native). All other information is conditional on the source the term is imported from and the information available there (e.g. has_DbXref or has_OBONamespace). I have begun editing each term to ensure it contains all pertinent information in its annotations.
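
A check for these required elements could be sketched roughly as follows in Python. This is only an illustration: the function name `missing_annotations` and the `classes` dictionary are hypothetical, the annotation lists are copied by hand from this post, and I am assuming the contributor rule means "at least one of definition_Contributor or created_by".

```python
# Elements every class annotation should carry, per the list above.
REQUIRED = ("rdfs:label", "id", "definition", "definition_Source")
# Assumption: at least one of these two must be present.
CONTRIBUTOR_ANY_OF = ("definition_Contributor", "created_by")


def missing_annotations(annotations):
    """Return the required annotation names that a class does not have."""
    present = set(annotations)
    missing = [name for name in REQUIRED if name not in present]
    if not any(name in present for name in CONTRIBUTOR_ANY_OF):
        missing.append(" or ".join(CONTRIBUTOR_ANY_OF))
    return missing


# Hand-copied annotation lists for the two example classes in this post.
classes = {
    "concentration of": [
        "rdfs:label", "id", "definition", "hasOBONamespace", "hasDbXref",
        "definition_Source", "has_Exact_Synonym", "inSubset",
    ],
    "Count": [
        "rdfs:label", "definition", "definition_Contributor",
        "definition_Source", "has_Related_Synonym",
    ],
}

for label, annotations in classes.items():
    print(label, "->", missing_annotations(annotations) or "complete")
```

Under these assumptions, ‘Count’ is flagged for its missing id, and ‘concentration of’ is flagged for having neither contributor field.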

The second task this week, improving the thematic content of ECSO with regard to carbon cycling, is less straightforward. The existing class structure within ECSO already contains several carbon cycling terms. Key among these are ‘carbon pool’ and ‘carbon cycling’, shown in the figures below as they are currently structured in the ECSO framework.

As you can see, the term ‘carbon cycling’ is a subclass of ‘Environmental System Process’, which is a subclass of ‘Process’, itself a subclass of ‘Occurrent’. The class ‘Occurrent’ is a sibling of the class ‘Entity’, of which ‘Carbon Pool’ is a subclass. So there is a fair bit of preexisting structure on which I can begin building the carbon cycling ontology into ECSO. In conceptualizing the carbon cycle, the first thing that comes to mind is the classic illustration that I am sure you are familiar with, such as the one from NASA shown below.
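
For readers who prefer code to class diagrams, the subclass chain just described can be sketched as a small Python lookup. The `SUBCLASS_OF` dictionary and the `ancestors` helper are hypothetical names for this illustration; the links are typed in from the description above rather than read from ECSO, and I am assuming for simplicity that ‘Carbon Pool’ sits directly under ‘Entity’.

```python
# Child -> parent links, as described in the paragraph above.
SUBCLASS_OF = {
    "carbon cycling": "Environmental System Process",
    "Environmental System Process": "Process",
    "Process": "Occurrent",
    "Carbon Pool": "Entity",  # assumption: modeled here as a direct subclass
}


def ancestors(term, subclass_of=SUBCLASS_OF):
    """Walk upward from a term and return its superclasses, nearest first."""
    chain = []
    while term in subclass_of:
        term = subclass_of[term]
        chain.append(term)
    return chain


print(ancestors("carbon cycling"))
print(ancestors("Carbon Pool"))
```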

In the illustration, each ‘Entity’ (‘Carbon Pool’) of the carbon cycle is labeled and has an ‘Occurrent’ (‘Process’) associated with it, depicted by the arrows indicating carbon exchange between each ‘Entity’. In the ECSO framework, however, ‘Process’ is currently classified as both an ‘Occurrent’ and an ‘Entity’. This dual classification may seem confusing, but it follows from how the terms ‘Entity’ and ‘Occurrent’ are defined. Within ECSO, ‘Occurrent’ is defined as “an entity that has temporal part and that happens, unfolds or develops through time.” So it makes sense that ‘Process’ is both an ‘Entity’ and an ‘Occurrent’.

Currently, ECSO is missing these arrows. Next week, I plan to add these connections via the Object Properties ‘hasProcess’ and the inverse ‘hasComponent’. For example, the ‘Atmospheric Carbon Dioxide Pool’ hasComponent ‘carbon dioxide’, and the class ‘Leaf Litter Carbon Pool’ hasProcess ‘decomposition’. In addition, I will be organizing the many terms I have identified as essential for DataONE carbon cycling semantics into ECSO.
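
As a rough picture of what these connections would look like once added, the two examples above can be written as subject-property-object triples in Python. This is a sketch only: the properties do not exist in ECSO yet, and the `TRIPLES` list and the helper functions are hypothetical names used for illustration.

```python
from collections import defaultdict

# The two planned statements from the paragraph above, as (subject, property, object).
TRIPLES = [
    ("Atmospheric Carbon Dioxide Pool", "hasComponent", "carbon dioxide"),
    ("Leaf Litter Carbon Pool", "hasProcess", "decomposition"),
]


def index_by_subject(triples):
    """Group triples as subject -> property -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, prop, obj in triples:
        index[subject][prop].append(obj)
    return index


def objects_of(index, subject, prop):
    """Return the objects linked to a subject by a property (empty if none)."""
    return list(index.get(subject, {}).get(prop, []))


index = index_by_subject(TRIPLES)
print(objects_of(index, "Leaf Litter Carbon Pool", "hasProcess"))
print(objects_of(index, "Atmospheric Carbon Dioxide Pool", "hasComponent"))
```

Each triple plays the role of one arrow in the carbon cycle diagram: a pool on one end, and a process or component on the other.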

