DataONE has been around since 2010.
It’s an NSF project so it’s continually evaluated for performance.
One metric could be reach and engagement on social media, as a measure of awareness about DataONE.
Since I’ve looked at open science sentiment analysis before, I volunteered to poke around a bit on Topsy (www.topsy.com) and see what I could find.
First, a search of tweets for DataONE, by month.
There have been 31 tweets with the phrase “DataONE” in the past 30 days.
http://topsy.com/s?q=DataONE&window=m&sort=date
However, this includes various unrelated products named “DataONE,” among them a software vendor based in Southeast Asia and a broadband package.
Therefore, let’s take a look at tweets that mention DataONE’s Twitter handle:
Interestingly, for the period December 31 to January 30, there were actually 81 tweets mentioning @DataONEorg.
http://topsy.com/analytics?q1=%40DataONEorg&via=Topsy
Repeating this search for “All-time” yields the following URL:
http://topsy.com/s?q=%40DataONEorg&window=a&type=tweet&sort=date
I assume there is a limit for the free version.
I navigated to the last page:
http://topsy.com/s?q=%40DataONEorg&window=a&type=tweet&sort=date&offset=90
The results are sorted by newest. The oldest tweet available is dated two months ago.
I changed the “offset” key to 200.
The oldest tweet available is dated three months ago.
http://topsy.com/s?q=%40DataONEorg&window=a&type=tweet&sort=date&offset=200
I changed the “offset” key to 900. The oldest tweet available is a year ago.
http://topsy.com/s?q=%40DataONEorg&window=a&type=tweet&sort=date&offset=900
I changed the “offset” key to 999. No data loaded.
I changed the “offset” key to 1,000. No data was retrieved.
I reloaded http://topsy.com/s?q=%40DataONEorg&window=a&type=tweet&sort=date&offset=900
The oldest tweet was from Carly Strasser:
https://twitter.com/carlystrasser/status/245276554333151232
It was dated September 10, 2012.
There are 10 tweets per page. The pages are accessible in multiples of ten.
That is, working backwards from 900, previous would be 890.
This explains why 999 did not work.
Let’s try again with 990.
Success!
http://topsy.com/s?q=%40DataONEorg&window=a&type=tweet&sort=date&offset=990
The oldest tweet is from 2 years ago, by Geoff Barker (@geoffmuse).
It is dated July 29, 2012.
Using 1000 as a key failed. What happens if we use 1010?
http://topsy.com/s?q=%40DataONEorg&window=a&type=tweet&sort=date&offset=1010
No tweets found.
Therefore it appears the farthest back in time we can go with Topsy’s (free) analytics is July 29, 2012.
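The paging pattern above (10 tweets per page, offsets in multiples of ten, nothing past 990) can be sketched in a few lines of Python. This is only an illustration of how the URLs in this post are put together: the function names are my own, the parameter names are copied from the URLs above, and the sketch builds URLs without fetching anything.

```python
from urllib.parse import urlencode

BASE_URL = "http://topsy.com/s"  # search URL as it appears in this post
PAGE_SIZE = 10                   # tweets per page, as observed above


def topsy_search_url(query, offset=0, window="a"):
    """Build a search URL shaped like the ones used in this post.

    Hypothetical helper: it only mirrors the query-string keys seen
    above (q, window, type, sort, offset).
    """
    if offset % PAGE_SIZE != 0:
        # Offsets such as 999 loaded no data; only multiples of ten worked.
        raise ValueError("offset must be a multiple of %d" % PAGE_SIZE)
    params = {"q": query, "window": window, "type": "tweet", "sort": "date"}
    if offset:
        params["offset"] = offset
    return BASE_URL + "?" + urlencode(params)


def page_urls(query, last_offset):
    """Return one URL per page, from offset 0 up to and including last_offset."""
    return [topsy_search_url(query, off)
            for off in range(0, last_offset + PAGE_SIZE, PAGE_SIZE)]


print(topsy_search_url("@DataONEorg", offset=990))
```

Calling page_urls("@DataONEorg", 990) gives 100 URLs, which matches the apparent cap of about 1,000 tweets in the free version.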
We know that DataONE is older than 2012.
How long has the @DataONEorg Twitter account been available?
The analytics service “twtrland” shows metrics for the account, including when it joined.
http://twtrland.com/profile/DataONEorg
@DataONEorg has been on Twitter since Thursday, November 18, 2010.
For data available 3 months ago:
There are 202 followers
52% are female
73% are from the United States.
It’s possible to sort by number of followers.
http://twtrland.com/profile/DataONEorg/followers
“We analyze all the content people share and how their audience reacts to it, to find their influential skills”
The skills reported are as follows:
Science, Research, Universities, Scientists, Community, Biologists, Biology, Management, Genomics, Librarians, Library, Publishing, Technology, Bioinformatics, Digital, Education, Ecology, Professors, Writers, Open Access.
There are 84 replies per 100 tweets
There are 28 RTs per 100 tweets
37% of tweets are links
There are 0.01 tweets per day
The demographics information is interesting:
Some of the advanced analytics are available only in the “pro version.” There is a free trial option, but only Community Engagement & Outreach coordinator Amber Budden would have access to it. I’ll let her know in case she wants to try it out.
At any rate, the point of looking at twtrland was to pinpoint when @DataONEorg went live.
I now have a date: Thursday, November 18, 2010.
I can go back to Topsy and try a range of dates to get around the problem I ran into earlier: I could not reach tweets past offset 990, where the oldest tweet was dated July 29, 2012.
November 1, 2010 to July 1, 2012
Now I want to try November 1, 2010 to November 1, 2011
http://topsy.com/s?q=%40DataONEorg&type=tweet&sort=date&mintime=1288612824&maxtime=1320148851
I wish I could make sense of the numbers here (1288612824) and (1320148851) but for now I just need to accept on faith that they correspond to November 1, 2010 and November 1, 2011.
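For what it’s worth, the two numbers look like Unix timestamps, that is, seconds counted from January 1, 1970 (UTC). That reading is my assumption rather than anything Topsy documents here, but it is easy to check with Python’s standard library. The helper names below are my own.

```python
from datetime import datetime, timezone


def from_epoch(seconds):
    """Convert seconds since 1970-01-01 UTC into a timezone-aware datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def to_epoch(year, month, day):
    """Convert a calendar date (midnight UTC) into seconds since 1970-01-01."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# The mintime and maxtime values from the URL above.
print(from_epoch(1288612824).isoformat())  # 2010-11-01T12:00:24+00:00
print(from_epoch(1320148851).isoformat())  # 2011-11-01T12:00:51+00:00

# Going the other way, e.g. to start a range at the account's join date.
print(to_epoch(2010, 11, 18))
```

Both values land just after noon UTC on November 1 of their respective years, so the date range in the URL checks out. With to_epoch it should be possible to build mintime and maxtime values for any other range.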
The “most recent” tweet appears to be:
October 31, 2011.
https://twitter.com/mcdonald/status/131111854327070720
The oldest tweet appears to be somewhere beyond an offset of 150.
Let’s try an offset key of 300
No tweets found.
Let’s try an offset key of 220.
No tweets found.
Let’s try an offset key of 200.
No tweets found.
180.
None
170.
Three tweets found.
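Guessing offsets by hand works, but the same hunt can be done as a binary search over pages. The sketch below assumes that if one offset returns tweets, every smaller offset does too. The has_results argument is a hypothetical callback standing in for “load this offset and see whether any tweets come back”; nothing here talks to Topsy.

```python
PAGE_SIZE = 10  # tweets per page, as observed earlier in this post


def last_nonempty_offset(has_results, max_offset=990):
    """Return the largest multiple-of-ten offset that still has tweets.

    has_results(offset) is a caller-supplied (hypothetical) function that
    returns True when the page at that offset shows at least one tweet.
    Returns None when even the first page is empty.
    """
    if not has_results(0):
        return None
    lo, hi = 0, max_offset // PAGE_SIZE  # search over page indices
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if has_results(mid * PAGE_SIZE):
            lo = mid       # tweets here, so the last page is at mid or later
        else:
            hi = mid - 1   # nothing here, so the last page is earlier
    return lo * PAGE_SIZE


# Stand-in for the real lookup, mirroring what I saw above:
# offset 170 had tweets, offset 180 had none.
print(last_nonempty_offset(lambda offset: offset <= 170))
```

With a cap of 990 this needs at most about eight page loads instead of however many guesses it takes by hand.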
The very first tweet (that was not spam) in Topsy mentioning @DataONEorg was from Heather Piwowar.
https://twitter.com/DataONEorg/status/47691665087016960
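So, working backwards from an offset of 170: at 10 tweets per page that is the 18th page, and it held three tweets. For the 365-day period between November 1, 2010 and November 1, 2011, there were therefore at least 173 mentions (the 170 tweets before that page plus the 3 on it), since the offset counts tweets rather than pages.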
I say “at least” because in the case of the “first tweet,” 4 other users “re-tweeted” the same tweet, and that was not reflected in the numbers from Topsy.
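The count is worth double-checking. Assuming the offset counts tweets skipped rather than pages (which the step from 900 back to 890 earlier suggests), the total is the last offset plus the tweets on the last page. A small sketch, with helper names of my own:

```python
PAGE_SIZE = 10  # tweets per page, as observed earlier in this post


def page_number(offset):
    """Return the 1-based page number for a multiple-of-ten offset."""
    if offset % PAGE_SIZE != 0:
        raise ValueError("offset must be a multiple of %d" % PAGE_SIZE)
    return offset // PAGE_SIZE + 1


def tweets_seen(last_offset, tweets_on_last_page):
    """Total tweets listed: everything skipped plus the last page's tweets."""
    if last_offset % PAGE_SIZE != 0:
        raise ValueError("offset must be a multiple of %d" % PAGE_SIZE)
    if not 1 <= tweets_on_last_page <= PAGE_SIZE:
        raise ValueError("last page must hold between 1 and %d tweets" % PAGE_SIZE)
    return last_offset + tweets_on_last_page


print(page_number(170))     # 18
print(tweets_seen(170, 3))  # 173
```

Since retweets of the same tweet were not counted separately, the result is a lower bound.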
So, this appears to be a method for poring over the history of tweets.
How to systematically extract that data is another topic, which I will address in a separate post.