Potential starting points

I have done a cursory exploration of some potential starting points for this project. Here are my notes, which are a bit scattered. Let me know if you have questions.

Ideas on where to start:

  • Literature search on other estimates of data
  • Definitions of data
  • NEON: how they define data, estimate how much data they will produce, etc.
  • Think about the timeline: how much will the amount of data change over time? What was it like in the past?
  • Individual researchers: how much data do they generate? How much of the data is published versus how much is generated?
  • Search for databases, database capacity
  • Google search for definitions of data
  • Comp sci textbooks?
  • Data management (or database management) textbooks?
  • Ecology/environmental biology textbooks?
  • from libraries? (Dartmouth Example)

Websites to check out:

Papers to check out:

  • [Bell 1994]. Alan Bell; IBM Academy Digital Library Workshop (Sept 12-13, 1994).
  • [Census 1995]. United States Census Bureau; Statistical Abstract of the United States, Government Printing Office (1995).
  • [Fargion 1996]. G. S. Fargion, R. Harberts, and J. G. Masek; “An Emerging Technology Becomes an Opportunity for EOS.” Online file; see http://ecsinfo.hitc.com/cdwg/datamining/overview.html.
  • [Landauer 1986]. T. K. Landauer; “How much do people remember? Some estimates of the quantity of learned information in long-term memory,” Cognitive Science,10 (4) pp. 477-493 (Oct-Dec 1986).
  • [Louis 1996]. Steve Louis; “Cooperative High-Performance Storage in the Accelerated Strategic Computing Initiative,” 5th NASA Goddard Conference on Mass Storage Systems and Technologies (Sept. 17-19, 1996). As reported by Ron Van Meter, http://www.isi.edu/~rdv/conferences/goddard96.html.
  • [Markoff 1997]. John Markoff; “When Big Brother is a Librarian,” The New York Times pp. 3, sec. 4 (March 9, 1997).
  • [Mauldin 1995]. Matt Mauldin, “Measuring the Web with Lycos,” Third International World-Wide Web Conference, April 1995.
  • [Mills 1996]. Mike Mills; “Photo Opportunity,” Washington Post pp. H01 (January 28, 1996).
  • [Radding 1990]. Alan Radding; “Putting data in its proper place,” Computerworld pp. 61 (August 13, 1990).
  • [Tenopir 1997]. Carol Tenopir, and Jeff Barry; “The Data Dealers,” Library Journal pp. 28-36 (May 15, 1997).
  • [UNESCO 1995]. UNESCO Statistical Yearbook, Bernan Press (1995).
  • [Wells 1938]. H. G. Wells; World Brain, Methuen (1938).
  • [Hilbert 2011]. Martin Hilbert and Priscila López; “The World’s Technological Capacity to Store, Communicate, and Compute Information,” Science pp. 60-65 (April 1, 2011). Published online February 10, 2011. DOI: 10.1126/science.1200970

The following papers are all on the server for you to access in a folder called “Reprints”. The files are named [last name of first author][last two digits of publication year], e.g., Bollier10.pdf.

  • [1]    D. Bollier. The Promise and Peril of Big Data. Technical report, The Aspen Institute, 2010.
  • [2]    S. Carlson. Lost in a sea of science data. The Chronicle of Higher Education, 52(42):A35, June 23, 2006.
  • [3]    C. Doctorow. Big data: Welcome to the petacentre. Nature, 455(7209):16–21, Sept. 2008. PMID: 18769411.
  • [4]    P. B. Heidorn. Shedding light on the dark data in the long tail of science. Library Trends, 57(2):280–299, Fall 2008.
  • [5]    D. Howe, M. Costanzo, P. Fey, T. Gojobori, L. Hannick, W. Hide, D. P. Hill, R. Kania, M. Schaeffer, S. S. Pierre, S. Twigger, O. White, and S. Y. Rhee. Big data: The future of biocuration. Nature, 455(7209):47–50, 2008.
  • [6]    C. Lynch. Big data: How do your data grow? Nature, 455(7209):28–29, 2008.
  • [7]    O. J. Reichman, M. B. Jones, and M. P. Schildhauer. Challenges and Opportunities of Open Data in Ecology. Science, 331(6018):703–705, February 2011.
  • [8]    V. S. Smith. Data publication: towards a database of everything. BMC Research Notes, 2(1):113, 2009.
