This week, I took a first pass at drafting a categorization of common errors ecological researchers make when publishing their data. The base of this categorization scheme was the five classes of ecological metadata [Michener et al., 1997]:
- General data set descriptions
- Research description
- Accessibility information
- Data structure description
- Supplemental information
This provided the basic structure, and I fleshed it out with elements from the Federal Geographic Data Committee (FGDC) metadata standard [FGDC, 2000], the ISO 19115 standard for metadata (which built on the prior FGDC standard) [FGDC, 2011], and Ecological Metadata Language (EML) [LTER, 2011].
Once I had a basic hierarchy, I started marking up reviews of data papers submitted to the Ecological Society of America (ESA). Not surprisingly, once I actually tried to use the draft error categorization, I found that some categories were not relevant and that others were missing. I analyzed five data paper reviews, modifying the error categories along the way. The next step is to revise the error categorization scheme based on input from my DataONE mentors, William Michener and Robert Cook. Then, I will reanalyze the data paper reviews with the new categorization scheme.
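To make the markup step concrete, here is a minimal Python sketch of how a two-level categorization could be stored and how coded errors from a review could be tallied. The five top-level classes are the ones listed above; the sub-categories, the `tally_by_class` function, and the example codes are hypothetical placeholders for illustration, not my actual draft scheme or real review data.

```python
from collections import Counter

# Top-level classes as listed in the post (after Michener et al., 1997).
# The sub-categories are hypothetical placeholders, not the draft scheme itself.
SCHEME = {
    "General data set descriptions": ["title", "abstract"],
    "Research description": ["methods", "study site"],
    "Accessibility information": ["contact", "usage rights"],
    "Data structure description": ["variable definitions", "units"],
    "Supplemental information": ["quality assurance", "related publications"],
}


def tally_by_class(coded_errors):
    """Count coded errors per top-level class; reject codes not in the scheme."""
    counts = Counter({cls: 0 for cls in SCHEME})
    for cls, sub in coded_errors:
        if cls not in SCHEME or sub not in SCHEME[cls]:
            raise ValueError(f"not in scheme: {cls!r} / {sub!r}")
        counts[cls] += 1
    return dict(counts)


# Made-up codes for a single review, for illustration only.
example_review = [
    ("Data structure description", "units"),
    ("Data structure description", "variable definitions"),
    ("Research description", "methods"),
]

print(tally_by_class(example_review))
```

Raising an error on an unknown code is a deliberate choice in this sketch: it surfaces the moment a review contains a problem the scheme has no category for, which is the signal that the scheme needs revising.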
- Federal Geographic Data Committee. (2000). Content Standard for Digital Geospatial Metadata Workbook Version 2.0. Federal Geographic Data Committee. Washington, D.C.
- Federal Geographic Data Committee. (2011). Preparing for International Metadata. Federal Geographic Data Committee. Washington, D.C.
- Long Term Ecological Research Network. (2011). EML Best Practices for LTER Sites Version 2.0. Long Term Ecological Research Network. Albuquerque, N.M.
- Michener, W.K., et al. (1997). Nongeospatial metadata for the ecological sciences. Ecological Applications, 7(1), 330-342.