Digital Preservation: An Unsolved Problem

Return to main article:

Given the convenience and potential cost saving of digital delivery for both libraries and users, combined with the power digitization offers to search within texts, why not embrace the digital future now? The issue of preservation is one of the main obstacles.

Speaking from her experience as head of collection care for the British Library, Helen Shenton explains that “the greatest risks to printed material are the environment, wear and tear, security, and custodial neglect.” Facilities such as the Harvard Depository address most of those concerns, although wear and tear is an unavoidable consequence of use. On the digital side, on the other hand, use of the data is one of the best ways of preserving it, because “‘bit rot’ is one of the biggest risks.” A book left on the shelf for a hundred years might be fine, Shenton says, but digital data must be read and checked constantly to ensure their integrity. 

For digital preservationists, a prime concern is that data might be kept perfectly secure and complete, but still be unreadable by machines and programs in the future. A New Yorker cover depicting an alien, come to post-apocalyptic Earth, sitting amid the detritus of modern civilization--discarded CDs, tapes, and computers--illustrates the point: the alien is reading a book, the only thing that still “works.” “You have to think about moving the content along as technology changes,” explains Andrea Goethals, digital-preservation and repository-services manager. In order to make this feasible, librarians try to limit the number of file formats they make use of, and store detailed technical metadata with every object so that in 10 years or 100 “it can be rendered again in a usable way.” Every few years, as the programs that created a text file or a PDF become obsolete, librarians must ensure that the contents of those files remain readable by the current generation of computers and software. But opening each file manually in order to save it in a current format is not feasible when there are millions of them. “Because of the enormous amount of digital material we hold, migrating content is done at scale, not one file at a time,” Goethals explains. “We have to be able to do it on a whole class of objects”--all Microsoft Word files, for example. This content management strategy should work, “but because digital preservation is a young science, we don’t have a lot of experience with it yet.”

Objects that begin in an analog format--a book, a recording of a poetry reading, or a piece of music--are easiest to preserve digitally because librarians can choose the optimal file format for long-term access. Material that is born digital--e-mail, for example, which comes in many different, often proprietary, formats--is not so easy to preserve. This is proving a vexing problem for librarians charged with maintaining records of the University’s intellectual life. Harvard archivists, for example, must figure out what to do with the president’s e-mail--both a technical and legal challenge, because of privacy laws governing its handling. 

For medical librarians like Isaac Kohane, the lack of case notes from the last 20 years is an equally grave concern. He recalls his excitement, shortly after becoming director of the Countway, at being able to read the notebooks of a surgeon who was about to conduct the first successful cadaveric kidney transplant, for which he later won the Nobel Prize. “You could see the foresight and the hypothesis he had, as he made that leap. So it is stunning to realize that we are in a period of unprecedented lack of documentation of academic output.” Most notes, images, and communications today are stored as e-mail. “I’m worried that, 30 years hence, people will say, ‘Well, what was happening during the great genetics revolution at Harvard University?’ They’ll have no idea.”

Read more articles by: Jonathan Shaw

You might also like

More Housing in Allston

Toward another apartment complex on Harvard-owned land

General Counsel Diane Lopez to Retire

Stepping down after 30 years of University service

Navigating Changing Careers

Harvard researchers seek to empower individuals to steer their own careers.

Most popular

William Monroe Trotter

Brief life of a black radical: 1872-1934

Romare Bearden

Brief life of a textured artist: 1911-1988

A Real-World Response Paper

In Agyementi, Ghana, Sangu Delle ’10 brings clean water to a village.

More to explore

Illustration of a box containing a laid-off fossil fuel worker's office belongings

Preparing for the Energy Transition

Expect massive job losses in industries associated with fossil fuels. The time to get ready is now.

Apollonia Poilâne standing in front of rows of fresh-baked loaves at her family's flagship bakery

Her Bread and Butter

A third-generation French baker on legacy loaves and the "magic" of baking

Illustration that plays on the grade A+ and the term Ai

AI in the Academy

Generative AI can enhance teaching and learning but augurs a shift to oral forms of student assessment.