The journey from a records management system to a digital preservation system

“People have had a lot of trouble getting stuff out of RecordPoint.”

This sentence was a little worrying to hear. It was 2015, and our archive was contemplating digital preservation for the first time. We didn’t really know what it was, or how it worked. Neither did anyone else: the idea of having a “digital preservation system” received blank stares around the office. “Is it like a database? Why not use one of our CMS’s instead? Why do we need this?”

And so it was that I realised I was in over my head and needed outside help. I looked up state records offices to find out what they were doing, and realised there is such a thing as the job title “Digital Preservation Officer”. I contacted one of these “Digital Preservation Officers” to get on the right path.

The Digital Preservation Officer’s knowledge in that early conversation was invaluable, and helped us get over those early hurdles. She explained the basics: why digital preservation is important for an archive, how to get started, what the jargon means, and how to convince non-archivists that yes, it is necessary. And, above all, the importance of figuring out what you want to preserve.

“We will need to preserve digital donations,” I listed, “and digitizations of our physical inventory. Plus, I manage our digital records management system, RecordPoint – if we’re serious about our permanent records we will need to preserve those as well.” (The international digital records management system standard, ISO 16175 Part 2, says that “long-term preservation of digital records… should be addressed separately within a dedicated framework for digital preservation or ‘digital archiving’”.)

It was at this point that the Digital Preservation Officer replied with the quote that began this article.

I don’t think she was quite right – getting digital objects and metadata out of RecordPoint was quite easy. The challenge, it turned out, would be getting the exported digital objects into our digital preservation system, Archivematica.

In the image shown below, the folders on the left represent the top level of a RecordPoint export of two digital objects. The folders on the right are what Archivematica expects in a transfer package.

In the example above, there are three folders for ‘binaries’ (digital objects) and two folders for ‘records’ (metadata). Immediately something doesn’t make sense – why are there three binary folders for two objects?

The reason is that the export includes not only the final version of the digital object but also all previous drafts. In my example there is only a single draft, but if a digital object had 100 drafts, they would all be included here. This is great for compliance, but not so great for digital preservation where careful appraisal is necessary. The priority when doing an ‘extract, transform, load’ (ETL) from RecordPoint to Archivematica would be to ensure that the final version of each binary made it across to the ‘objects’ folder on the right.

An Archivematica transfer package should not only consist of the digital objects themselves, of course – you are not truly preserving digital objects unless you also preserve their descriptive metadata. This is why the ‘metadata’ folder on the right exists: you can optionally create a single CSV file, ‘metadata.csv’, which contains the metadata for every digital object in the transfer, one row per object. Archivematica uses this CSV file as part of its metadata preservation process.
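As a minimal sketch of what such a file can look like: Archivematica's documentation describes a ‘metadata.csv’ whose first column is ‘filename’ (a path beginning with ‘objects/’), followed by descriptive columns such as Dublin Core ‘dc.title’ and ‘dc.date’. The file names and values below are invented purely for illustration, and the choice of columns is an assumption rather than the set we actually used.

```python
import csv
import io

# Hypothetical metadata for two preserved objects (illustrative values only).
rows = [
    {"filename": "objects/annual-report.pdf", "dc.title": "Annual report", "dc.date": "2014-06-30"},
    {"filename": "objects/board-minutes.docx", "dc.title": "Board minutes", "dc.date": "2015-02-11"},
]


def build_metadata_csv(rows):
    """Return the text of a metadata.csv with a header and one row per object."""
    fieldnames = ["filename", "dc.title", "dc.date"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


metadata_csv = build_metadata_csv(rows)
print(metadata_csv)
```

In a real transfer this text would be saved as ‘metadata/metadata.csv’ alongside the ‘objects’ folder.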

In contrast, RecordPoint creates a metadata file for every one of the digital objects it exports. If you wanted to pull metadata across into the metadata CSV file for the Archivematica transfer, you would need to go through every single metadata XML file in the export and copy and paste each individual metadata element. Based on a test, sorting the final record from the drafts and preparing its metadata for Archivematica might take two to four minutes per record. Assuming we have 70,000 records requiring preservation, transforming these records manually would take roughly 2,300 to 4,700 hours. Although technically possible, this is too much work to be practical, and the tedious, detail-oriented nature of the work would make errors highly likely.

Fortunately, I knew the R programming language. R is used by statisticians to solve data transformation problems – and this was a data transformation problem! I created an application using a tool called R Shiny, providing a graphical interface that sits on the Archivematica server. I creatively called it RecordPoint Export to Archivematica Transfer (RPEAT). After running a RecordPoint export, you select the export to be transformed from a drop-down list in RPEAT and select the metadata to be included from a checklist. RPEAT then copies the final version of each digital object from the export into an ‘objects’ folder and trawls through each XML file to extract the required metadata. Finally, RPEAT creates a CSV file that contains all of the required metadata, and moves it into the ‘metadata’ folder. Everything is then ready for transfer into Archivematica.
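The steps RPEAT performs can be sketched in a few lines. RPEAT itself is an R Shiny application; the sketch below is written in Python purely for illustration and is not RPEAT's code. The export layout, the XML element names (‘Title’, ‘Creator’, ‘FinalBinary’) and the demo files are all hypothetical assumptions, since the real RecordPoint export schema is not reproduced here.

```python
import csv
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def transform_export(export_dir, transfer_dir, columns):
    """Copy each record's final binary into objects/ and write metadata/metadata.csv.

    `columns` maps (hypothetical) XML element names to CSV column names.
    """
    export_dir, transfer_dir = Path(export_dir), Path(transfer_dir)
    objects_dir = transfer_dir / "objects"
    metadata_dir = transfer_dir / "metadata"
    objects_dir.mkdir(parents=True, exist_ok=True)
    metadata_dir.mkdir(parents=True, exist_ok=True)

    rows = []
    for xml_path in sorted(export_dir.glob("records*/*.xml")):
        record = ET.parse(xml_path).getroot()
        # ASSUMPTION: <FinalBinary> holds the final version's path,
        # relative to the export root. Drafts are simply never copied.
        final_binary = export_dir / record.findtext("FinalBinary")
        shutil.copy2(final_binary, objects_dir / final_binary.name)
        row = {"filename": f"objects/{final_binary.name}"}
        for tag, column in columns.items():
            row[column] = record.findtext(tag, default="")
        rows.append(row)

    csv_path = metadata_dir / "metadata.csv"
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["filename", *columns.values()])
        writer.writeheader()
        writer.writerows(rows)
    return rows


def build_demo_export(root):
    """Create a tiny fake export: two records, one of which also has a draft."""
    files = {
        "binaries1/report_draft.txt": "draft text",
        "binaries2/report.txt": "final text",
        "binaries3/minutes.txt": "minutes text",
        "records1/report.xml": "<Record><Title>Annual report</Title>"
        "<Creator>Finance</Creator>"
        "<FinalBinary>binaries2/report.txt</FinalBinary></Record>",
        "records2/minutes.xml": "<Record><Title>Board minutes</Title>"
        "<Creator>Secretariat</Creator>"
        "<FinalBinary>binaries3/minutes.txt</FinalBinary></Record>",
    }
    for relative, content in files.items():
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp) / "export"
    transfer_dir = Path(tmp) / "transfer"
    build_demo_export(export_dir)
    demo_rows = transform_export(
        export_dir, transfer_dir, {"Title": "dc.title", "Creator": "dc.creator"}
    )
    copied = sorted(p.name for p in (transfer_dir / "objects").iterdir())
    csv_text = (transfer_dir / "metadata" / "metadata.csv").read_text(encoding="utf-8")

print(copied)
print(csv_text)
```

The mapping passed as `columns` plays the role of the metadata checklist: only the elements selected there end up in the CSV. A production version would also need to handle name clashes between final binaries and records with missing elements.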

Pushing 212 records exported from RecordPoint through RPEAT, selecting the correct metadata from the checklist, and doing some quick human quality assurance took 7 minutes. Scaled up, transforming all 70,000 records this way would take fewer than 39 hours. At two to four minutes per record by hand, RPEAT reduces the time taken to prepare records for Archivematica by roughly 98 to 99% compared to manual processes.
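The scaling estimate can be reproduced from the figures quoted in the text: two to four minutes per record by hand, 212 records in 7 minutes through RPEAT, and 70,000 records in total. The sketch assumes that both processes scale linearly with the number of records, which is a simplification.

```python
RECORDS = 70_000
MANUAL_MINUTES_PER_RECORD = (2, 4)  # low and high estimates from the manual test
RPEAT_TEST_RECORDS = 212
RPEAT_TEST_MINUTES = 7


def to_hours(minutes):
    """Convert minutes to hours."""
    return minutes / 60


# Manual effort at the low and high per-record estimates.
manual_hours = tuple(to_hours(RECORDS * m) for m in MANUAL_MINUTES_PER_RECORD)

# RPEAT effort, scaling the 212-record test run up to all records.
rpeat_hours = to_hours(RECORDS / RPEAT_TEST_RECORDS * RPEAT_TEST_MINUTES)

# Fraction of time saved relative to each manual estimate.
reduction = tuple(1 - rpeat_hours / h for h in manual_hours)

print(f"Manual: {manual_hours[0]:,.0f} to {manual_hours[1]:,.0f} hours")  # 2,333 to 4,667
print(f"RPEAT:  {rpeat_hours:.1f} hours")  # 38.5
print(f"Saving: {reduction[0]:.1%} to {reduction[1]:.1%}")  # 98.3% to 99.2%
```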

The advice that the Digital Preservation Officer provided all those years ago was invaluable, and I think in particular the warning on “getting stuff out of RecordPoint” was pertinent – but I wish to expand on her point. The challenge is not unique to RecordPoint – the challenge is ETL in general. At a meeting of Australia and New Zealand’s digital preservation community of practice, Australasia Preserves, in early 2019, other archivists shared their struggles to do ETL from records management systems into their digital archives. The ability to carry out ETL is an important addition to the growing suite of technical skills valuable to us as digital preservation practitioners.


Archive in Practice: An imagined exhibition

 Part one: Archive in Practice

One dimension of photography is that it is concerned with the staging of a struggle against the loss of memory – an attempt to archive and preserve what is about to disappear for good.[1] Gerhard Richter

These reflections by artist Gerhard Richter encapsulate the very reason why I was lured into working with the medium. “An attempt to archive and preserve what is about to disappear for good”… Photography frames subjective experience in time. The act of taking a photograph is a highly romantic gesture – it captures a frame in time, which then becomes a fragment, isolated from its whole.

3.2 The View. 2006 [2011]. Archival Inkjet Print on Hahnemühle Photo Rag. 77 x 46cm. Edition of 3 + 2 AP. (Including the black edges of the film strip and a sliver of the next photograph on the film amplifies the notion of the fragment.)

Every single photograph that I have ever taken contributes to an organically growing archive of an irretrievable past defined in pictorial representation. This archive is the foundation of my art practice, in which the images within it become subject to constant reinterpretation and reconfiguration.

Acts of Recall, [short excerpt], 2015, 16:9, colour, 14 min, 36 seconds, video still

By continually retrieving earlier photographs and combining them with more recent pictures, I explore new sets of formal connections and narrative relationships, which then surface other imagery or elements. In this way, my work reflects upon the transitory nature of meaning and memory, thereby amplifying the paradox of photography.

Drifting Down, [The Dome], 2012. Archival Inkjet Print on Hahnemühle Photo Rag. 100x100cm. Edition of 5 + 2 AP

I am working with an acutely active archive, one that is constantly expanding physically as I continue to take pictures using analogue film in combination with digital printing processes. However, it is the emotional impact of each of these pictures, and how they evoke and interact with my own memory, that causes my archive to function. The enduring questions are:

How does one preserve content in an archive that is driven by “the felt”, the narrative and the poetic?

 How does one organise and manage the content in an archive that is continually changing in meaning and has endless manifestations, inter-relationships and formal and narrative connections?

All The Gardens I Could Find – Installation View
Blindside, Melbourne, 2016.

 I explore these themes through projects and exhibitions. Through the use of installation strategies I create pictorial and spatial structures that often function as a visual and temporal representation of the archival process and the concept of the catalogue as a completed physical item.

I playfully present photographs from my archive as a composite experience across a gallery space, thematically arranged, described and in constant dialogue with one another. This is realized through using colour, components of text and careful placement of the works in relation to the architecture of the gallery space. I usually include mechanisms for storing, reimaging or archiving like boxes, tables, folders, envelopes, and frames as a way of suggesting that the order is not fixed, and that the material is always in a state of being sorted through and processed – meaning is always in a constant state of flux.

Series 5: Overlaps – Garden Green and Sky Blue. Installation View Detail. Blindside, Melbourne 2016.

  For When All the Leaves Will Fall (Chiang Mai, Thailand) 2016 (2015)
  Archival Inkjet Print on Hahnemühle Photo Rag. 54x37cm.
  Edition of 5 + 2 Ap.

Working in collection institutions like the State Library of Victoria and the British Library as my day job, I have been exposed to the institutional workflows, archival tools and processes used to manage and preserve collection material and to make it accessible and discoverable for users. I have been inspired by the principles of archival arrangement and description, and by the systems used to store, display and handle collections. This day-to-day engagement has undoubtedly woven itself into my own methodology.

The second half of this article for Archivoz takes on the form of an imagined exhibition where archival tools and principles are employed to organize and display the works, as well as to amplify readings concerned with the fragmentary. The concept of the archive is also used as a metaphor for representing the inner workings of the mind.

The Course of Leaving [Of course I will be Leaving]. 2010.
Archival Inkjet Print on Hahnemühle Photo Rag Baryta. 60x40cm


Part two: An Imagined Exhibition

A single table is positioned across the centre of the gallery, causing the room to be split into two parts. The dimension of this table permits only just enough space for the viewer to move around it and access the other half of the gallery.

The table’s surface acts as a carrier of meaning. Upon this surface lie fragments of images – unmounted, unframed and resting in piles that seem to be assembled into groupings according to colour, pictorial content and geometric forms. The surface layer of pictorial content is presented to the viewer, while the photographs embedded underneath are concealed by the nature of the pile. These deeper layers suggest a personal content that is not accessible.

For Proust, the deepest, most profound memories really need to have been “lost” by being gradually covered over by other memories…[2] Embedded underneath the surface layers of the pile is ‘the true emotional tone of the past’.

The viewer enters the space through the whiteness and emptiness, being lured toward the zone of the table by the fragments of deep and vivid colors revealed between sheets of creamy white paper and manila folders that evoke the sense of residue that has accumulated over the years. In this structure, the pile is a metaphor for The Ruin and one of the beauties of a Ruin is its ability to be re-constructed.

The space in the back half of the gallery (behind the table) is roused by activity – large-scale photographs, evocative and contemplative, are assembled onto the whiteness of the walls – activating them with colour, light and image.

“Archives are seen as rows and rows of boxes on shelves, impenetrable without the codex which unlocks their arrangement and location”[3]. In this pictorial structure, it is as though the contents of the archive have emerged from their boxes and folders in storage and are undergoing a process of renewal, construction and re-construction.

A code of access is provided for the viewer through the visual dialogue that operates between the piles of information laid out on the table and the photographs on the walls. Memory is used here as a device: through installation strategies like repetition, groupings, rhythms and contrasts in scale, the viewer’s own memory can be evoked.

As the viewer passes through the area around the table to access the back half of the gallery, they will encounter a Finding Aid, which invites them to go deeper into uncovering further layers of content, through the descriptive information listed at item level.

The archive presented here is fluid, flowing, and its content discoverable through the act of slowing down and paying attention to the subtle codes revealed visually through the careful placement of works throughout the space.

 A Room for Ordering Memory. 2012. Installation View. Counihan Gallery, Melbourne.

For further information:

[1] Richter, G. (2010). Between Translation and Invention: The Photograph in Deconstruction. In Copy, Archive, Signature: A Conversation on Photography. Stanford, California: Stanford University Press.

[2] Gross, D. (2000). Lost Time – On Remembering and Forgetting in Late Modern Culture. Amherst, Massachusetts: University of Massachusetts Press.

 [3] Breakwell, S. (Spring 2008). Perspectives: Negotiating the Archive. In Tate Papers 9. Retrieved from