Archiving Citation Libraries for Open Notebook Science

In May, when my open notebook was first set up, I considered the idea of “media management” in a post called “Open Notebook Science – Media Management.”

I touched on Mendeley, Figshare, and Flickr as potential ways of sharing media.

I now want to point out another post by Carly Strasser under the DataONE for Librarians category entitled “Researcher Needs Assessment Bibliography” because it includes another tool I had not considered called “Cite U Like.”

Either comment on this blog post or join CiteULike and add a reference to the RDMneeds Group Library.

I’m thinking of this because yesterday I made some notes about “I like this” with regard to citations. It’s worthwhile to copy and paste things out to reference later – but in terms of access, I’d prefer to have something more dynamic, open, and broadly accessible.

It does concern me that something like CiteULike could go the way of Connotea, Magnolia, or Google Reader. A more complex story is that of the social bookmarking service “del.icio.us” or “Delicious,” which promised the “tastiest bookmarks on the Web.” Ultimately Yahoo! decided that tasty bookmarks were no longer its strong suit; they sold Delicious and all the bookmarks therein. However, I can attest that while the bookmarks were preserved, not all of the functionality was, nor even all of each bookmark: the new site had a tighter character limit on descriptive text. (I moved my bookmarks to Diigo soon after Delicious’ uncertain future was announced.)

The point of this is to say that an open notebook science workflow needs to account for the possibility / likelihood that an online media management system might end up going extinct – or at least undergo a “loss” mutation.

So for something like the “Researcher Needs Assessment Bibliography” group at <http://www.citeulike.org/group/18394>, or for other document libraries such as the handful that 2011 DataONE summer intern Jonathon Carlson created on Mendeley (see for example <https://notebooks.dataone.org/data-reuse/links-to-mendeley-data/>), there needs to be a mechanism for “hands free” importing of bookmarks.

As an aside – and it is fascinating how closely this aligns with the CiteULike group – there is a “Data Management For Librarians” group on Mendeley, of which former DataONE summer intern Carlson is also a member.

Regarding exports of bookmarks / bibliographies on autopilot – CiteULike provides the following export options – even to someone like myself who is not logged in:

RIS	Export as RIS which can be imported into most citation managers
BibTeX	Export as BibTeX which can be imported into most citation/bibliography managers
PDF	Export formatted citations as PDF
RTF	Export formatted citations as RTF which can be imported into most word processors
Delicious	Export in format suitable for direct import into delicious.com.
Formatted Text	Export formatted citations as plain text

I will say that Delicious did a good job at providing export options. HTML, JSON, RSS – lots of ways of exporting, which ultimately made it easy to export my thousands of bookmarks from Delicious to Diigo. However it still required my intervention. And to be fair my bookmarks are still available – in fact looking at my old bookmarks provides something of a “trail of crumbs” reflecting ideas that I encountered and projects I was working on at specific times – very much like an open notebook.

For CiteULike, there is a feed:

feed://www.citeulike.org/rss/group/18394

But how you archive that feed is subject to the same pitfalls as how you generate and store your original citations. If you felt smug archiving RSS content via Google Reader, for example, that archive was only as durable as Google Reader itself. I can’t recall what options Google gave you to “take out” your data before shutting the service down (I still don’t understand why they can keep mounds of e-mail but not mounds of kilobyte-sized XML output).
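One way to avoid depending on any single reader or aggregator is to snapshot the feed into plain files that you control. The following is only a minimal sketch, not a description of anything CiteULike provides: the function names, the output file naming, and the assumption that each entry sits in an `item` element with `title`, `link`, `description`, and subject/category children are all mine, and a real feed should be inspected before relying on it.

```python
import json
import xml.etree.ElementTree as ET
from datetime import date
from pathlib import Path


def local_name(tag):
    """Strip an XML namespace, so '{http://purl.org/rss/1.0/}item' becomes 'item'."""
    return tag.rsplit("}", 1)[-1]


def parse_feed_items(xml_text):
    """Return one dict per <item> element (accepts str or bytes).

    Assumption: entries are <item> elements with title/link/description
    children and optional subject or category children used as tags.
    """
    root = ET.fromstring(xml_text)
    items = []
    for element in root.iter():
        if local_name(element.tag) != "item":
            continue
        record = {"title": "", "link": "", "description": "", "tags": []}
        for child in element:
            name = local_name(child.tag)
            text = (child.text or "").strip()
            if name in ("title", "link", "description"):
                record[name] = text
            elif name in ("subject", "category") and text:
                record["tags"].append(text)
        items.append(record)
    return items


def save_snapshot(items, folder, today=None):
    """Write items to a dated JSON file so each day's snapshot gets its own file."""
    today = today or date.today()
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"citations-{today.isoformat()}.json"
    path.write_text(json.dumps(items, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


# Hypothetical usage (requires network access, so it is left commented out):
# from urllib.request import urlopen
# xml_bytes = urlopen("http://www.citeulike.org/rss/group/18394").read()
# save_snapshot(parse_feed_items(xml_bytes), "citation-archive")
```

Run from a scheduler such as cron, something like this would take care of the “hands free” part, and the dated JSON files remain readable even if the service that produced the feed disappears.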

Perhaps something such as the “tag team” aggregator would be a solution? http://tagteam.harvard.edu/hubs/3#hub_3=2 – but notice that 14 of the 15 “input” feeds for this “open access” themed aggregation are from now-defunct Connotea – with 1 from the “nearly nixed” Delicious.

There appears to be a nice preservation of some key metadata (title, summary, links, tags). http://tagteam.harvard.edu/hub_feeds/928/feed_items/305204

But again, this is born digital content and what happens when funding runs out or the proprietors lose interest?

The “Tag Team” site is semi-preserved on the Internet Archive: https://web.archive.org/web/20130113083449/http://tagteam.harvard.edu/hubs/3

As for CiteULike, the list of groups is not archived: https://web.archive.org/web/20081121092542/http://www.citeulike.com/groups/browse

To be very “meta” about it – what happens to this open notebook? Certainly the posts are of interest to me.

Here’s an older post captured by Google:

http://webcache.googleusercontent.com/search?q=cache:tgHOxN5g6lUJ:https://notebooks.dataone.org/data-science/open-notebook-science-tagging-posts-concerning-correspondence/+&cd=20&hl=en&ct=clnk&gl=us&client=safari

And the Internet Archive only recently (May 2013) began saving notebooks.dataone.org – and rather poorly at that – see <https://web.archive.org/web/*/https://notebooks.dataone.org/>.
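Whether a given page has been captured can also be checked programmatically rather than by browsing the calendar view. The sketch below assumes the Internet Archive’s Wayback availability endpoint at `https://archive.org/wayback/available` and a JSON response shaped as I understand it (an `archived_snapshots` object holding a `closest` entry); the helper names are hypothetical, and the response format should be verified against the Archive’s own documentation.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for the Wayback Machine availability check.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD-style string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return (snapshot_url, timestamp) from the JSON reply, or None if not archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


# Hypothetical usage (requires network access, so it is left commented out):
# from urllib.request import urlopen
# reply = urlopen(availability_url("https://notebooks.dataone.org/")).read()
# print(closest_snapshot(reply))
```

Looping a check like this over every post URL in the notebook would show which entries have no capture at all, which is the list a researcher or librarian would want to act on.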

There are some tools out there that can automate the process of open science – here’s a presentation I came across today. http://ropensci.github.io/workshops-oxford-2013-09/00-introduction/intro_slides/#intro (which, perhaps unsurprisingly, is linked to open notebook science advocate Carl Boettiger).

However the key thing I’m realizing here is that along with the platform or tools for sharing – there also needs to be attention on the part of the researcher – or perhaps the researcher in concert with the librarian – to the process for archiving open science output.

About Tanner Jessel
