Building a long-term archive for cultural data

Digitisation projects are making large amounts of data available to the public online. The DigiBern project, for example, is an online portal for information on the history and culture of the city and canton of Bern.

It was set up by the University Library Bern. Even in such an exemplary case, however, it has become clear that libraries face further tasks after a digitisation project is complete in order to ensure that the data remain accessible over the long term.

This is because the data are spread across different storage media with only a single, manual backup, and they must undergo quality checks to verify that they are readable and intact.

Long-term security and usability

The E-Rara project, a platform for digitised printed works from Swiss libraries, is a similar case. The University Library Bern has an obligation to archive the data for the long term but was previously unable to guarantee this.

The expansion of the institutional repository BORIS for the university’s publications also brought with it a need to find a solution to ensure that these data remain secure and usable for years to come.

With all this in mind, the library included the new challenge of long-term digital archiving for the first time in its strategy for the period from 2013 to 2016.

Outsource construction and operation?

In the planning phase, various options for implementing the digital archive were evaluated, one of the key questions being how feasible it would be to outsource as much of the construction and operation as possible.

The decision was eventually made to build an in-house archive one step at a time. With resources tight and the library having to acquire the necessary know-how as it went along, it was decided that a pilot archive would first be installed in an initial phase lasting three years.

Lengthy discussions followed regarding the requirements for a digital archive and potential solutions, and the implementation work began in 2015.

The follow-up tasks arising from the digitisation projects mentioned above led to two separate mandates: SWITCH provides a central platform for data storage, while Baden-based docuteam is responsible for ensuring the quality and long-term accessibility of the data.

Scalability is crucial

One of the most important requirements for the archive from the library’s point of view is that it must be geared to continuity and firmly embedded in its infrastructure and services. This is what will make it reliable and sustainable.

Even though only a limited body of data is being archived in the early stages, the system should be scalable and able to cope with future demands such as curating research data.

Another goal is to secure and document the university’s research output so as to ensure that it can be used and quoted well into the future.

The University Library Bern’s digital archive is based on an internationally recognised reference model, the open archival information system (OAIS, ISO 14721).

The OAIS standard includes an information model specifying the technical data and metadata that need to be added to digital resources so that they can be kept and used over the long term.

It also includes a functional model outlining the technical and organisational tasks that must be performed for a digital archive. The standard was very helpful in designing the digital archive and is also serving as the basis for its technical implementation and future operation.
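To make the idea of the information model more concrete, the following is a minimal Python sketch of the kind of technical metadata that can be stored alongside a digitised file so that its integrity can be checked later. It is an illustration only and not the library's or docuteam's actual implementation: the PreservationRecord class, its fields, the function names and the choice of SHA-256 are all assumptions made for this example, and OAIS itself does not prescribe a particular checksum algorithm or record layout.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PreservationRecord:
    """Hypothetical, simplified technical metadata for one archived file."""
    filename: str
    size_bytes: int
    sha256: str


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe(path: Path) -> PreservationRecord:
    """Record name, size and checksum at the time the file is ingested."""
    return PreservationRecord(path.name, path.stat().st_size, sha256_of(path))


def is_intact(path: Path, record: PreservationRecord) -> bool:
    """Check that the file still matches what was recorded at ingest."""
    return (
        path.stat().st_size == record.size_bytes
        and sha256_of(path) == record.sha256
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scan = Path(tmp) / "example_scan.tif"  # hypothetical file name
        scan.write_bytes(b"pretend image data")
        record = describe(scan)
        print("intact after ingest:", is_intact(scan, record))

        # Simulate silent corruption on the storage medium.
        scan.write_bytes(b"pretend image dat4")
        print("intact after corruption:", is_intact(scan, record))
```

Re-running a check like is_intact at regular intervals is one simple way to automate the readability and integrity checks described earlier, instead of relying on a single manual backup.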

Minimising risks

The system uses open-source software from docuteam to collect and prepare data, together with the repository software Fedora Commons, which is also covered by an open-source licence. SWITCHengines is used to operate the necessary servers and for data storage.

In fact, it is this new infrastructure offering that made the step-by-step approach to constructing the archive possible in the first place.

Minimal resources are being used for testing at the moment, and the archive can be scaled up over the next few months without the library needing to worry about infrastructure issues. This new, flexible SWITCH service also makes it possible to minimise risks in IT projects.

Text: Marion Prudlo

Marion Prudlo completed a Master’s degree in comparative literature at the University of Tübingen in 1996, followed by another in library and information science at the University of Pittsburgh. She worked as an electronic resources librarian in the US until 2005, when she became head of the E-Library Service Centre at the University of Bern.

The original article was published on 8 February 2016 on the SWITCH website, where it is also available in German, French and Italian.

Image: Preserved for posterity and digitised for online access: view of Bern’s lower old town from the year 1830. (© Swiss National Library)

Published: 03/2016
