2023-10-26

Digital Preservation Team

In June 2022 the IT Department established a team dedicated to preserving the National Library’s digital collection. The team handles all kinds of digital content, whether digitized from physical sources or born-digital, including web pages, text documents, images, audio, and moving images.

The team’s responsibilities involve ingesting, checking, storing, preserving, and providing access to high-quality digital files. We work closely with several other specialized media teams in the library. In addition, we are members of the Digital Preservation Coalition.

Organisation

The Digital Preservation team consists of 8 members:

Trond Teigen

Torbjørn Bakken Pedersen

Thomas Edvardsen

Vigdis Marie Sørensen

Senior platform developer

Siarhei Kulakou

Application developer

Johannes Karlsen

Application developer

Lise-Lotte Melkild

Metadata specialist

Sandra Kråkstad

Metadata specialist

This team reports to a committee of leaders responsible for this area in the National Library. The members are:

IT Director (Product owner)
Director of Digitalizing Cultural Heritage
Head of Metadata Standards Development Section
Head of IT Platform Section

The National Library’s digital collection in numbers

Over 2 billion files
More than 90 different file formats
15 petabytes of data (that’s 15,000 terabytes!) stored in 3 copies
The largest single file is 2.5 terabytes
Daily ingest of new material averages over 4 terabytes
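
To put these figures in perspective, the short Python sketch below derives a few rough numbers from them. It assumes decimal units (1 petabyte = 1,000 terabytes, as stated above), treats "over 2 billion" and "over 4 terabytes" as lower bounds, and assumes that the 15 petabytes describe a single copy of the collection. That last point is an interpretation, not something the figures confirm.

```python
# Figures taken from the list above (decimal units: 1 PB = 1,000 TB).
TOTAL_PB = 15            # size of the collection
COPIES = 3               # number of stored copies
FILES = 2_000_000_000    # "over 2 billion files", used as a lower bound
DAILY_INGEST_TB = 4      # "over 4 terabytes" per day, used as a lower bound

# Assumption: 15 PB is the size of one copy, so three copies occupy 3x that.
raw_storage_pb = TOTAL_PB * COPIES

# Average file size in megabytes (1 PB = 1e9 MB).
avg_file_mb = TOTAL_PB * 1e9 / FILES

# Yearly growth in petabytes if the daily average holds all year.
yearly_growth_pb = DAILY_INGEST_TB * 365 / 1000

print(f"Raw storage across copies: {raw_storage_pb} PB")
print(f"Average file size: {avg_file_mb:.1f} MB")
print(f"Yearly growth: {yearly_growth_pb:.2f} PB")
```

Under these assumptions the average file is a few megabytes, while the largest single file is several hundred thousand times bigger, which hints at how varied the material is.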

Data volume by type

Video and television: 22%
Film: 21%
Newspapers: 19%
Web Archive: 16%
Radio and audio: 12%
Books: 8%
Photos: 2%
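
The percentages can be turned into approximate volumes. The sketch below assumes that the shares apply to the 15 petabyte total given earlier and that the published percentages are rounded, so the results are estimates only.

```python
TOTAL_PB = 15  # collection size from the figures above

# Share of total data volume per content type, in percent.
shares = {
    "Video and television": 22,
    "Film": 21,
    "Newspapers": 19,
    "Web Archive": 16,
    "Radio and audio": 12,
    "Books": 8,
    "Photos": 2,
}

# Sanity check: the published shares add up to 100%.
assert sum(shares.values()) == 100

# Approximate volume per content type in petabytes.
volumes_pb = {name: TOTAL_PB * pct / 100 for name, pct in shares.items()}

for name, pb in volumes_pb.items():
    print(f"{name}: {pb:.2f} PB")
```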

Technologies we use for digital preservation

Apache Kafka for sending messages between systems
Apache NiFi for running the data flows that validate, move, and package data
MariaDB as the database engine
DROID for identifying file formats
Grafana for statistics and reporting
IBM High Performance Storage System (HPSS) as the bit repository
GlusterFS for shared temporary storage
CentOS Linux as the server platform
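
As an illustration of the kind of information that systems could exchange when a file has been checked, the sketch below computes a SHA-256 checksum for a file and wraps it in a small JSON event. Everything in it is an assumption made for the example: the event name, the field names, and the choice of SHA-256 are hypothetical and do not describe the team's actual message format or validation rules. It uses only the Python standard library and produces bytes, which is the form a message value takes when sent through a broker such as Kafka.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_event(path: Path) -> bytes:
    """Build a hypothetical 'file validated' event as UTF-8 JSON bytes."""
    event = {
        "event": "file-validated",  # hypothetical event name
        "filename": path.name,
        "size_bytes": path.stat().st_size,
        "sha256": sha256_of(path),
    }
    return json.dumps(event, sort_keys=True).encode("utf-8")


# Demonstration with a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"hello")
    payload = build_event(sample)
    print(payload.decode("utf-8"))
```

Reading the file in chunks keeps memory use constant, which matters when single files can reach terabyte size.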

Last updated on 2024-12-06