3.7 Administrative Metadata – Preservation Metadata

3.7.1     The information described in this section is part of the administrative metadata grouping. It resembles the header information in the audio file and encodes the necessary operating information. A computer system recognises a file, and how it is to be used, by first associating the file extension with a particular type of software and then reading the coded information in the file header. This information must also be recorded in a separate file to facilitate management and aid future access, because file extensions are at best ambiguous indicators of the functionality of the file. The fields which describe this explicit information, including type and version, can be acquired automatically from the headers of the file and used to populate the fields of the metadata management system. If an operating system, now or in the future, does not include the ability to play a .wav file or read an .xml instance, for example, then the software will be unable to recognise the file extension and will not be able to access the file or determine its type. By making this information explicit in a metadata record, we make it possible for future users to use the preservation management data and decode the content data. The standards being developed in AES-X098B, which will be released by the Audio Engineering Society as AES57 “AES standard for audio metadata – audio object structures for preservation and restoration”, codify this aspiration.
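As an illustration of acquiring type information from the file header rather than the extension, the following is a minimal sketch in Python. It assumes a plain RIFF/WAVE file with a standard 'fmt ' chunk; the function name and the keys of the returned record are hypothetical and are not taken from AES57 or any other schema.

```python
import struct
from typing import BinaryIO, Dict, Union


def read_wav_technical_metadata(stream: BinaryIO) -> Dict[str, Union[str, int]]:
    """Read basic format fields from the header of a RIFF/WAVE stream.

    The keys of the returned record are illustrative only.
    """
    riff_id, _riff_size, form_type = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or form_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    # Walk the chunk list until the 'fmt ' chunk is found.
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            raise ValueError("no 'fmt ' chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"fmt ":
            fmt = stream.read(chunk_size)
            break
        # Skip this chunk; chunks are padded to an even number of bytes.
        stream.seek(chunk_size + (chunk_size & 1), 1)

    if len(fmt) < 16:
        raise ValueError("'fmt ' chunk is too short")
    format_tag, channels, sample_rate, _byte_rate, _block_align, bits = (
        struct.unpack("<HHIIHH", fmt[:16])
    )
    return {
        "container": "RIFF/WAVE",
        "format_tag": format_tag,  # 1 means linear PCM
        "channels": channels,
        "sample_rate_hz": sample_rate,
        "bits_per_sample": bits,
    }
```

A record produced this way could be stored alongside the file, so that the format remains documented even if the extension is lost or no longer recognised.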

3.7.2     Format registries now exist, though they are still under development, that help to categorise and validate file formats as a pre-ingest task. PRONOM is an online technical registry, including file formats, maintained by The National Archives, UK (TNA); it can be used in conjunction with another TNA tool, DROID (Digital Record Object Identification), which performs automated batch identification of file formats and outputs metadata. From the U.S., Harvard University's GDFR (Global Digital Format Registry) and JHOVE (JSTOR/Harvard Object Validation Environment, for the identification, validation, and characterization of digital objects) offer comparable services in support of preservation metadata compilation. Accurate information about the file format is the key to successful long-term preservation.
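The general idea behind signature-based format identification can be sketched as follows. This is not the DROID or JHOVE implementation and does not consult PRONOM; it is a minimal illustration in Python that checks a handful of well-known leading byte patterns. The function name and the returned labels are hypothetical and are not registry identifiers.

```python
from typing import Optional


def identify_format(header: bytes) -> Optional[str]:
    """Guess a format from the first bytes of a file, ignoring its extension.

    Only a tiny, illustrative set of signatures is checked.
    """
    if header[0:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "RIFF/WAVE audio"
    if header[0:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "AIFF audio"
    if header[0:4] == b"fLaC":
        return "FLAC audio"
    # An XML declaration may be preceded by a UTF-8 byte order mark.
    text = header[3:] if header.startswith(b"\xef\xbb\xbf") else header
    if text.lstrip().startswith(b"<?xml"):
        return "XML document"
    return None  # unknown: flag for manual inspection before ingest
```

Passing the first dozen or so bytes of each file to such a function during pre-ingest would flag files whose content does not match their extension; a production workflow would rely on a maintained registry rather than a hand-written table.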

3.7.3     Most important is that all aspects of preservation and transfer relating to audio files, including all technical parameters, are carefully assessed and kept. This includes all subsequent measures carried out to safeguard the audio document in the course of its lifetime. Though much of the metadata discussed here can be safely populated at a later date, the record of the creation of the digital audio file, and of any changes to its content, must be created at the time the event occurs. This history metadata tracks the integrity of the audio item and, if using the BWF format, can be recorded as part of the file as coding history in the BEXT chunk. This information is a vital part of the PREMIS preservation metadata recommendations. Experience shows that computers are capable of producing copious amounts of technical data from the digitisation process, and this may need to be distilled in the metadata that is kept. Useful element sets are proposed in the interim set AudioMD (http://www.loc.gov/rr/mopic/avprot/audioMD_v8.xsd), an extension schema developed by the Library of Congress, or in the AES audioObject XML schema, which at the time of writing is under review as a standard.
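To show how coding history held in a BWF file might be read back into a metadata system, the following is a minimal sketch in Python. It assumes the payload of the BEXT chunk has already been extracted, that the coding history begins after the 602 bytes of fixed-length fields defined in EBU Tech 3285, and that each line is a comma-separated list of key=value pairs (A, F, W, M, T and so on) in the manner of EBU R98. The function name is hypothetical.

```python
from typing import Dict, List

# Size in bytes of the fixed-length fields that precede CodingHistory
# in the BEXT chunk (assumed from EBU Tech 3285).
BEXT_FIXED_SIZE = 602


def parse_coding_history(bext_payload: bytes) -> List[Dict[str, str]]:
    """Return one dict per coding history line, in file order."""
    raw = bext_payload[BEXT_FIXED_SIZE:]
    # The history is text; ignore anything after a terminating NUL.
    text = raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")

    entries: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields: Dict[str, str] = {}
        for part in line.split(","):
            key, sep, value = part.partition("=")
            if sep:
                fields[key.strip()] = value.strip()
            # Parts without '=' are ignored in this sketch.
        entries.append(fields)
    return entries
```

Each entry corresponds to one stage in the signal chain, so the list as a whole gives the sequence of processes that produced the file.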

3.7.4     When digitising legacy collections, these schemas are useful for describing not only the digital file but also the physical original. Care needs to be taken to avoid ambiguity about which object is being described in the metadata: it will be necessary to describe the work, its original manifestation and subsequent digital versions, but it is critical to be able to distinguish what is being described in each instance. PREMIS distinguishes the various components in the sequence of change by associating them with events, and linking the resultant metadata through time.
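The principle of linking objects through events can be sketched as a simple data structure. The following Python example is a simplified illustration only: the class, field names, and identifiers are hypothetical and do not reproduce the element names of the PREMIS data dictionary.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """One change in the life of an item (field names are illustrative)."""
    event_type: str   # e.g. "digitisation", "migration"
    date_time: str    # when the event occurred
    source_id: str    # identifier of the object acted upon
    outcome_id: str   # identifier of the object that resulted


def trace_history(object_id: str, events: List[Event]) -> List[Event]:
    """Return the events that led to object_id, oldest first."""
    by_outcome = {event.outcome_id: event for event in events}
    chain: List[Event] = []
    seen = set()
    current = object_id
    # Follow each object back to its source until none is recorded.
    while current in by_outcome and current not in seen:
        seen.add(current)
        event = by_outcome[current]
        chain.append(event)
        current = event.source_id
    chain.reverse()
    return chain
```

Because every record names both the object acted upon and the object that resulted, it is always clear whether a given description refers to the physical original or to one of its digital versions.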