Challenges for Future Archivists

By Erica Stiner

Key Challenges and Remedies for Future Access:

    • File formats
      • Use lossless, widely adopted, non-proprietary formats
      • Save artifacts in multiple ways
    • Maintaining file integrity
      • Maintain backups of files
      • Select a hosting service with a user-friendly service agreement
    • Web 2.0 tools are dynamic
      • Save artifacts in multiple ways
      • Describe artifacts in plain text - the most accessible option possible
    • Hosting service unreliable
      • Select a service with user-friendly policies
      • Be prepared to move artifacts to a new hosting service

Accessing the Web 2.0 tools available today will be impossible in the future unless archivists are diligent in saving and maintaining the files.

Digital preservation presents some unique challenges. The rate of technological change quickly renders file formats, storage media (e.g. floppy disks) and access media (e.g. computers) obsolete. Operating systems likewise become outdated. While digitization was at first lauded as an inexpensive means of preservation, research indicates that its cost, and even its viability, may be more of an obstacle than originally envisioned. As stated in the CCSDS OAIS (Open Archival Information System) Blue Book (2002), "preserving information in digital forms is much more difficult than preserving information in forms such as paper or film" and institutions that had never before considered archiving must now take an active role if information is to be preserved.[1] As the OAIS model indicates, digital information stands the best chance of remaining maintainable if archival actions are taken from the moment of creation. Metadata is needed for both preservation and enduring access, though exactly what this metadata should entail is still a matter of discussion and research.[2]

Undoubtedly, not everything posted to the internet should be preserved, but a good sample that reflects both the depth and breadth of tools being used today for living healthier lives should be saved. The challenges to preservation are many and interconnected. The Pursuit of Healthiness 2009 time capsule faces three constraints on implementing the best practices in digital preservation that could overcome these challenges: insufficient technology to preserve the digital objects, no funding, and minimal technical knowledge. Because we have a free account, we have no control over what Wiki Spaces does with its systems. Digital media need to be replaced all too frequently, both to prevent data loss and to update systems that would otherwise become obsolete. Copying data from one device to another risks data loss. Running checksums is nearly pointless unless one has the data backed up to replace corrupted files or the knowledge to repair the files. Anyone can create an account and load data into the wiki system, making it vulnerable to intentional and accidental infection from viruses. Wiki Spaces could alter or delete our account at any time, or cease to operate. There is no promise to update systems as technology changes. At best, we would be notified of any change under way with enough time to reassemble the wiki using a new service provider. However, the benefit of digital artifacts remains. We have multiple copies of the artifacts on our personal computers, though not of the wiki that provides access and presentation. The presentation context would be lost, but access to the artifacts could be recreated. Perhaps this is appropriate; as society changes, needs and ideas of what an exhibition should be change.
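The point about checksums and backups can be illustrated with a minimal sketch. It assumes a primary folder and a backup folder holding the same artifacts; the function names (sha256_of, build_manifest, verify_and_repair) and the folder layout are hypothetical and are not part of the time capsule itself. A checksum only detects damage; the backup copy is what makes repair possible.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root (hypothetical layout)."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_and_repair(root: Path, backup: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Check each file against the manifest; restore from backup if possible."""
    report = {}
    for name, expected in manifest.items():
        primary = root / name
        if primary.is_file() and sha256_of(primary) == expected:
            report[name] = "ok"
            continue
        spare = backup / name
        # The checksum detected a problem; only an intact backup can fix it.
        if spare.is_file() and sha256_of(spare) == expected:
            primary.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(spare, primary)
            report[name] = "restored"
        else:
            report[name] = "lost"
    return report
```

A file reported as "lost" is the case the paragraph warns about: the damage is known, but nothing remains to replace the file with.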

Lack of funding is a very common problem, affecting what systems are used, what expertise is available to devise and carry out preservation strategies, and the selection of artifacts to preserve. Some preservation activities are more costly, in both time and money, than others. File formats that are widely used, have openly available source code, are operating-system independent, and are functionally less dynamic are easier to preserve, but ease of preservation must be balanced against choosing formats that present the artifact well (e.g. color quality, functionality, etc.). If many people use a format, the likelihood increases that someone, or the format owner, will create a means of moving the format forward as technology changes. Open source code, unlike proprietary code, allows people to create migration or other strategies. However, open source can mean less support and more work for users to make the software fit their needs. Other open source software is well supported and has a robust user community; just as with proprietary software, the archivist must choose carefully. Some formats, such as Windows Media Video, are popular enough that it is almost certain someone will create a means of enduring access, but converting to uncompressed Motion JPEG 2000 or MPEG could still be a better choice.

Web 2.0 tools tend to be dynamic, which means one artifact could consist of multiple file formats, all of which, along with the user interface, might need to be preserved individually. Preservation of static text or images would be easier, hence less costly, but the interactivity is the heart of what we are trying to preserve.[3] We chose to use software we already owned or tools freely available online. If we had the funds, we could store the time capsule in a commercial repository with data backups and contract for services that would increase the likelihood that the artifacts remain available in the future. Still, file format alone is not reason enough to exclude an exceptional artifact from the time capsule, especially since digital preservation is in its infancy and future developments might make the format accessible for the long term.

Unfortunately, like many cultural institutions, we lack the technical expertise to build a robust repository for our selected artifacts and carry out technically complex activities such as migration or emulation. We are taking a mixed approach, anticipating that at least some representation of each artifact will endure. First, where possible, we converted files to a lossless format preferred by the Library of Congress, which would ideally be validated using JHOVE. We tried to avoid acquiring files limited by digital rights management mechanisms that would prevent saving the file to our local system without restrictions. We usually took a screenshot of the interface showing the artifact so the visual context will be preserved. For interactive artifacts like maps, we tried to record the desktop interface as we demonstrated typical use of the artifact. We also provided descriptive metadata, both to give context to the wiki visitor and to document the artifact for future preservation. Our descriptive text can be backed up as PDF. Finally, we included a link to the original website that hosted the artifact and/or the website homepage, which is more enduring than the subpages artifacts typically reside on. The file formats on the wiki are oriented toward access in the present, in keeping with the OAIS model. The formats used for the wiki may be compressed or lossy because they load faster on current systems, which is desirable for users. The preservation copies, prepared using the mixed method described above, reside on the Pursuit of Healthiness team members' individual systems.
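The mixed approach described above could be documented with one plain-text record per artifact. The sketch below is an illustration only: the ArtifactRecord class, its field names, and the choice of JSON are assumptions made for this example, not a metadata standard and not the format the team actually used.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ArtifactRecord:
    """Hypothetical per-artifact record; every field name is an assumption."""
    title: str
    description: str          # plain-text description of the artifact
    source_url: str           # page where the artifact originally lived
    homepage_url: str         # site homepage, likely to outlast the subpage
    master_file: str          # lossless preservation copy
    access_file: str          # compressed or lossy copy served on the wiki
    screenshot: str = ""      # image of the artifact in its interface
    screen_recording: str = ""  # recording of typical interactive use
    notes: list[str] = field(default_factory=list)


def to_plain_text(record: ArtifactRecord) -> str:
    """Serialize the record as indented JSON, readable in any text editor."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Storing the record as plain text beside the files keeps the description readable even if the wiki, or the software that produced the artifact, is no longer available.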

The future archivist will need to maintain the files. Current strategies include migration and emulation. According to the OAIS model, which has become the standard for digital repositories, there are four types of migration. Transformation attempts to maintain the information but changes the content in some way. Repackaging changes the bits of the object's packaging. Replication copies the archival information package onto a different storage medium. Refreshment, a form of replication, involves making an exact bit-level copy of an archival information package onto a new storage medium of the same type, so no information architecture mapping is required. At this time, emulation is generally too costly and difficult to be undertaken routinely, though research currently underway may change the situation in the future. Emulation would recreate the application, and perhaps the underlying operating system, that can render a preserved file, so the file does not need to be altered.[4] Ideally, someone will develop a "universal machine" able to render any file, so that preservation becomes less of a challenge and access is assured. In the meantime, the Pursuit of Healthiness time capsule preservation strategy should reasonably ensure access for a time.
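The four migration types can be told apart by asking what a given copy operation changes. The following is a simplified sketch of that decision as this essay describes it, not the OAIS definitions verbatim; the Migration enum and the classify_migration function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Migration(Enum):
    REFRESHMENT = "refreshment"
    REPLICATION = "replication"
    REPACKAGING = "repackaging"
    TRANSFORMATION = "transformation"


def classify_migration(
    content_changed: bool,
    packaging_changed: bool,
    same_media_type: bool,
) -> Migration:
    """Name the migration type from what the operation alters (simplified)."""
    if content_changed:
        # The content itself is altered, e.g. converted to a new format.
        return Migration.TRANSFORMATION
    if packaging_changed:
        # Content is untouched, but the bits of its packaging change.
        return Migration.REPACKAGING
    if same_media_type:
        # Exact bit-level copy onto new media of the same type.
        return Migration.REFRESHMENT
    # Unchanged bits copied onto a different kind of storage medium.
    return Migration.REPLICATION
```

For example, copying the capsule unchanged from one external drive to a newer drive of the same kind would be classified as refreshment, while converting a video to another format would be a transformation.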

  1. ^ Consultative Committee for Space Data Systems. (January 2002). Reference Model for an Open Archival Information System. CCSDS 650.0-B-1. Blue Book. Issue 1. p. 1-3.
  2. ^ Lavoie, Brian F. (Summer 2004). "Preservation Metadata: Challenge, Collaboration, and Consensus" in Microform & Imaging Review 33(3), p. 130-4.
  3. ^ Arms, Caroline & Fleischhauer, Carl. (2005). "Digital Formats: Factors for Sustainability, Functionality and Quality." Paper presented at IS&T Archiving 2005 Conference, Washington, DC. Available at:
  4. ^ Consultative Committee for Space Data Systems. (January 2002). Reference Model for an Open Archival Information System. CCSDS 650.0-B-1. Blue Book. Issue 1. Section 5.