Title:Archiving Flickr and Other Websites of Interest to Museums
Authors:Ryan Donahue, Aaron Cope
Publication:MW2012: Museums and the Web 2012

The digital turn, whether in photography, motion pictures, literature or other pursuits, has forever changed the ways and means by which museums collect, interpret and disseminate objects. Historically, museums have dealt primarily with objects that are, in some way, intuitively graspable from observation. Arrowheads are pointy, photographs are plainly seen, and writing can be read (with varying degrees of difficulty). Digital information, on the other hand, is many times more difficult to grasp intuitively, and in many cases impossible. This has led to a shift in the approach to collection management: collect materials before they are scattered to the hard drives and floppy discs of history. With such a shift in approach, museums must become adept at preserving digital objects and ephemera in their original context, which in many cases is a website. We examine one site of particular interest and complexity: Flickr. Preserving Flickr has long been a popular subject of conversation among Flickr staff, alumni, and the museum community. Various tools have been built and first steps taken, but no one to date has addressed the seemingly impossible task of preservation. The authors of this paper are very familiar with its subject. Aaron Straup Cope is a Flickr alumnus, and Ryan Donahue is the Manager of Information Systems at George Eastman House, an early partner institution of the Flickr Commons. Drawing on our own knowledge and on consultation with museum professionals, we shall address:

  • The expectations of the community: that is, our assertions about the quality and characteristics of the information needed for preservation, which data is nice to have, and which is unlikely to be important in the future or may even be confusing.
  • The state of Flickr today: what preservation mechanisms exist, the availability of programmatic access to data via the Flickr API, and any ancillary issues pertaining to Flickr as it is today.
  • The technical specifications for preserving Flickr: storage size, medium, formats, etc., and the rough costs of archiving when leveraging advances in archival storage such as in-line block-level de-duplication, embedded metadata, and non-relational data stores.
  • A strategy for preservation that may be applied to other such sites of interest, integrating the foundational work of the OAIS model.