To make citizen science data more discoverable and usable, one strategy is to publish the data to a well-recognized, respected, and open repository such as the Global Biodiversity Information Facility (GBIF). GBIF records conform to Darwin Core Standards, community-generated standards that ensure biodiversity data and metadata are described with a standardized vocabulary. GBIF developed the Integrated Publishing Toolkit (IPT) to facilitate data publication and to ensure that all records conform to Darwin Core Standards by mapping terms from the source dataset (e.g. the column headers of a CSV file) to Darwin Core terms.
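To give a rough idea of what "mapping terms" means, here is a minimal Python sketch. It is only an illustration of the concept, not how the IPT itself works. The source column names (species, lat, lon, date_observed, observer, notes) and the sample row are made up for this example; the target names on the right are actual Darwin Core terms.

```python
import csv
import io

# Hypothetical source column names (left) mapped to Darwin Core terms (right).
# The source names are invented for illustration only.
COLUMN_TO_DWC = {
    "species": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "date_observed": "eventDate",
    "observer": "recordedBy",
}


def to_darwin_core(csv_text, mapping=COLUMN_TO_DWC):
    """Rename mapped columns to Darwin Core terms and drop unmapped columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {mapping[col]: value for col, value in row.items() if col in mapping}
        for row in reader
    ]


# Made-up sample data standing in for a citizen science CSV export.
sample = (
    "species,lat,lon,date_observed,observer,notes\n"
    "Bromus tectorum,40.57,-105.08,2013-06-14,J. Doe,near trail\n"
)

records = to_darwin_core(sample)
print(records[0])
```

In this sketch, any column without a Darwin Core equivalent (here, notes) is simply dropped. A real workflow would need to decide what to do with such columns rather than discard them silently.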
This week, I decided to explore the IPT to determine how Citsci.org might be able to use it in the near future to publish their citizen science datasets to GBIF. To download and run the IPT, I first had to get Tomcat (a servlet container, whatever that means) running on my computer. Having never heard of Tomcat before, it took me a whole day to figure out how to install and run it. Basically, I had to update my Java Runtime Environment and make a few custom tweaks before I was able to start a local instance of Tomcat using the Terminal application on my Mac.
I started with version 2.2.1 of the IPT, but Laura Russell of Vertnet advised me to revert to the older 2.1.1 version because the newest one still has many bugs. Laura was generous enough to walk me through the IPT installation and show me how to manage resources (datasets) and organizations and how to enter metadata. My next steps are to explore the IPT further with colleagues from Citsci.org, with the goal of using the IPT to upload citizen science data to GBIF by this fall.