Digital Content: Preserving the preserved

When we talk about preservation in Digital Humanities or Cultural Heritage, we are usually referring to the preservation of real objects through Information and Communication Technologies (ICT). The concerns revolve around digitising documents, creating 3D digitisations of artefacts or even reconstructing places through virtual environments or augmented reality.
In recent years, another preservation concern has emerged: the preservation of the digital content itself.
The fact is that digital content has to be maintained too. The obstacles to accessing stored data in the future are related to hardware and software, which change quickly and become obsolete.
As an example, the owner of a new box of 5 1/4 inch diskettes could only think of using them to make an original case for his newly released music album. He couldn’t use them for their original purpose, because he no longer has the hardware to read them:

Picture from Nuno Nunes' photostream. CC License: Attribution - Non commercial - Share Alike

If you didn’t recognize the previous picture, maybe you have used the one in the following picture, nowadays tagged as “vintage” and “retro”. Even if you still have one of these today, unless you have kept an old computer working you would find it very difficult to access its contents.

Picture from Eduardo!'s photostream. CC License: Attribution - Non commercial - No Derivative Works

Hardware and storage media become obsolete and there is not much we can do about it, other than migrating the content to newer hardware during the transition period.
Software imposes constraints on preservation too. You have probably already experienced difficulties in sharing files and dealing with different file formats. In this case, though, maybe we can do something about it.

According to Donna Benjamin, in her article “ODF: Our Document Future”, presented at the XTech Conference in 2006, there are “four elements of digital technology that threaten our future access to stored data:
1. The media on which it is stored.
2. The hardware used to design and create it.
3. The software used to create, read or edit that data and
4. The standards we rely on to record and format data.” [Benjamin, 2006]

Some strategies have been identified to address digital preservation. Benjamin lists four: technology preservation, which means keeping old technology working; emulation, which means developing virtual machines to run old software and data; migration, which consists of transferring data to modern systems and converting it to standard file formats; and encapsulation, which involves storing the data together with its metadata.

These four strategies can be summarised as two: preserving the hardware environment and overcoming the obsolescence of file formats. Benjamin believes that XML is the key to the latter problem.

“XML can be created and interpreted independently of any particular computer platform or program. It is an open standard that is widely accepted. It was developed with the aim of separating form, structure and content, and it is flexible and extensible. Most importantly perhaps, it is free and it is readable by both machines and human beings. Files can easily be converted into XML formats and XML is a useful way of recording and preserving metadata.” [Benjamin, 2006]
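
To make the idea of recording metadata in XML a little more concrete, here is a minimal sketch in Python. It is only an illustration: the element names (`record`, `name`, `mediaType`, `sizeBytes`, `checksum`) and the function names are invented for this example and do not follow any real preservation metadata standard.

```python
import hashlib
import xml.etree.ElementTree as ET


def describe_file(name: str, data: bytes, media_type: str) -> str:
    """Build a small XML metadata record for some stored data.

    The element names are hypothetical, chosen only for this sketch.
    """
    record = ET.Element("record")
    ET.SubElement(record, "name").text = name
    ET.SubElement(record, "mediaType").text = media_type
    ET.SubElement(record, "sizeBytes").text = str(len(data))
    # A checksum lets a future reader verify that the data is intact.
    checksum = ET.SubElement(record, "checksum", algorithm="sha256")
    checksum.text = hashlib.sha256(data).hexdigest()
    return ET.tostring(record, encoding="unicode")


def read_record(xml_text: str) -> dict:
    """Parse the record back; any XML parser could do the same."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "mediaType": root.findtext("mediaType"),
        "sizeBytes": int(root.findtext("sizeBytes")),
        "checksum": root.findtext("checksum"),
    }


if __name__ == "__main__":
    xml_text = describe_file("letter.txt", b"hello", "text/plain")
    print(xml_text)
    print(read_record(xml_text))
```

The record is plain text, so a person can read it in any text editor and a program can parse it without knowing which application wrote it.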

As far as office documents are concerned, an open international standard called OpenDocument has existed since 2006. “OpenDocument uses XML to store and describe data. This means the document content can be read independently of the application that created it.” [Benjamin, 2006]

Because “the specification on how to read, and develop software that can read and write ODF files is completely open” [Benjamin, 2006], it will still be possible to open an ODF (Open Document Format) document in the future, even if the program that created it no longer exists.
This is precisely the problem with closed, proprietary formats (for office documents, examples are .doc, .xls and .ppt): if the program that opens them disappears, opening those documents becomes very difficult and sometimes prohibitively expensive.
Another issue with such formats is interoperability, that is, the ability to share your documents with people who use different software, or to export your data for use in another program.
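
As a rough illustration of reading ODF content without the application that created it, the sketch below relies on the fact that an ODF file is a ZIP package whose text lives in a `content.xml` entry. It uses only Python's standard library. To stay self-contained it builds a tiny sample package in memory, which is an assumption of this example and not a complete, valid ODF document (it has no manifest or styles, for instance). The function names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces defined by the OpenDocument specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def make_sample_odt() -> bytes:
    """Build a stripped-down, ODT-like package in memory (illustration only)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>Hello, future reader.</text:p>"
        "<text:p>This text lives in plain XML.</text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", content)
    return buffer.getvalue()


def extract_paragraphs(odt_bytes: bytes) -> list:
    """Return the text of every paragraph found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    # text:p elements hold paragraphs; itertext() also collects nested spans.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    for paragraph in extract_paragraphs(make_sample_odt()):
        print(paragraph)
```

No office suite is involved at any point: a ZIP reader and an XML parser are enough to recover the text, which is the kind of independence from a specific program that the open specification makes possible.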

Regarding preservation, the same principle of independence from hardware and software is defended by Cohen & Rosenzweig (2005) in the book “Digital History”: “Similarly, at the same time that you avoid getting entangled in specific, possibly ephemeral digital technologies, you should be as neutral as possible with regard to hardware and software platforms. Dependence on a particular piece of hardware or software is foolish because, like most hardware and software through history, the computer technology you depend on today will eventually disappear, more likely sooner rather than later.”

In Australia, several institutions are developing digital preservation initiatives, such as the National Library of Australia with the PADI (a repository of information on digital preservation) and PANDORA (Australia’s Web Archive Project) projects, the Public Record Office of Victoria with the VERS initiative (which specifies a standard format for electronic records), and the National Archives of Australia with Xena (which converts digital documents from their original format into selected open, fully-documented formats used for archival preservation) [Benjamin, 2006].

If you want to try the ODF format for your documents, there are several programs to choose from. The best known is OpenOffice.org, a free, open-source, multi-platform office suite that saves documents in ODF by default. You can save your documents in other formats too.

