Why?  Why shouldn't URLs persist?  Do LC shelf numbers for books change?
 Does the street address of your house or business change?

Back in 2001 I wrote an article for Library Journal and I listed a number of
reasons why digital content disappears.  Changing all your site's URLs just
because you have a new director or you've chosen a new technology platform
makes no sense to me.


Modes of digital death
The changing of the political guard is but a single cause of such
disappearances. We can identify different modes of digital death:

   - *The New Replaces the Old:* Change needn't be as dramatic as the
   transition of power of the U.S. presidency. Every time an organization has
   new information to put on a web site or CD-ROM or other digital format,
   there is a tendency to publish the new content and simply to overwrite or
   toss the old.
   - *Content Reorganization:* Particularly in the web sphere, it is popular
   to reorganize content space periodically: when a new master is assigned to
   care for the content, when significant change in the content structure has
   occurred, or when people deem the content organization "stale." Many an
   Error 404 stems from simple reorganization.
   - *Death of a Sponsor:* When the sponsoring organization of a collection
   dies, so too may its digital content. The day the Al Gore campaign conceded
   the 2000 election, its web site went dark. Anyone hoping to analyze
   documents on that site was instantly deprived of access.
   - *Sponsor Loses Interest:* Most traditional print publishers consider
   the "back file" to be an asset worth protecting. Web-based publishers seem
   to lack this long-term view. For instance, Internet World, a print
   publication with a companion web site, has published since 1994. Until
   recently, a complete archive of back issues appeared on the web site. Now,
   the archive extends back only to July 1999.
   - *Sponsor Fears History:* In many cases, corporations may consciously
   avoid maintaining historical documents for fear of litigation. Ford, IBM,
   and other major companies have been sued in recent years for infractions
   alleged to have taken place over 50 years ago. Corporate document retention
   policies therefore tend to encourage disposal of documents. As corporate
   knowledge moves to the company intranet, corporations will face an
   increasing dilemma: to what extent are we protecting ourselves from
   potential litigation, and to what extent are we destroying corporate
   knowledge?
   - *Lost Functionality:* The disappearance of www.pub.whitehouse.gov, an
   MIT-developed site that offered rich search functionality, exemplifies this
   particular form of digital death. While NARA has preserved the raw content,
   the NARA search interface pales in comparison to the MIT product. Similarly,
   in February 2001, *Deja.com* <http://www.deja.com/> (formerly
   DejaNews.com), which provided a searchable archive of postings to Usenet
   news groups, was taken over by Google. While Google acquired all of the
   "intellectual assets" of Deja.com, it failed to preserve the search
   interface, breaking thousands of hyperlinks and making searches impossible
   for savvy Deja users. Google promises to restore the lost functionality in
   time; other takeovers may not be so sensitive to user demands.
   - *Media Format Obsolescence:* Anyone who owns data stranded on a 5 1/4"
   floppy disk knows the impact of media format changes. A newer storage
   technology supplants your format of choice; unless you take steps to copy
   old content to newer media, your data become stranded.
   - *Content Format Obsolescence:* Data stored in a proprietary format,
   such as an older version of WordPerfect or an obsolete tool such as PC
   Write, may be completely unreadable to current versions of software. Even
   web content could conceivably become obsolete, as the transition to newer
   generations of HTML or XML (or popular proprietary formats such as Flash)
   renders old content unusable with newer software.
   - *Disaster:* Whether a small-scale disaster (e.g., server meltdown) or a
   disaster that affects a campus (e.g., the 1994 earthquake that destroyed
   much of Cal State-Northridge), disaster can wipe out digital data whose
   sponsors have failed to provide for adequate offsite backup.
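
To make the "Content Reorganization" case concrete: a reorganized site can
keep its old URLs alive by answering them with a permanent redirect instead
of an Error 404. What follows is only a minimal sketch of that idea, not a
description of any particular server. The paths in REDIRECTS and the
function name resolve are hypothetical, made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical table of retired paths and where their content lives now.
REDIRECTS = {
    "/archive/1999/07/index.html": "/issues/1999-07/",
    "/staff/director.html": "/about/leadership/",
}


def resolve(url, redirects=REDIRECTS, max_hops=5):
    """Return (status, path) for a requested URL.

    Yields 301 with the final location if the path has moved, and 200 with
    the original path otherwise. Chains of moves are followed (up to
    max_hops) so that a reader is sent straight to the current location.
    """
    path = urlsplit(url).path or "/"
    target = path
    hops = 0
    # Follow old -> newer -> newest, with a hop limit as a guard against loops.
    while target in redirects and hops < max_hops:
        target = redirects[target]
        hops += 1
    if target != path:
        return 301, target
    return 200, path
```

With a table like this, each reorganization adds entries rather than
breaking links: a request for the old /staff/director.html is answered
with a 301 pointing at /about/leadership/, and a path that never moved is
served as before.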