2: Key Digital Principles

2.1 Standards: It is integral to the preservation of audio that the formats, resolutions, carriers and technology systems selected adhere to internationally agreed standards appropriate to the intended archival purposes. Non-standard formats, resolutions and versions may not be included in the future preservation pathways that will enable long term access and format migration.

2.2 Sampling Rate: The sampling rate fixes the maximum limit on frequency response. When producing digital copies of analogue material IASA recommends a minimum sampling rate of 48 kHz for any material. However, higher sampling rates are readily available and may be advantageous for many content types. Although the higher sampling rates encode audio outside of the human hearing range, the net effect of higher sampling rate and conversion technology improves the audio quality within the ideal range of human hearing. The unintended and undesirable artefacts in a recording are also part of the sound document, whether they were inherent in the manufacture of the recording or have been subsequently added to the original signal by wear, mishandling or poor storage. Both must be preserved with the utmost accuracy. For certain signals and some types of noise, sampling rates in excess of 48 kHz may be advantageous. IASA recommends 96 kHz as a higher sampling rate, though this is intended only as a guide, not an upper limit; for most general audio materials the sampling rates described should be adequate. For audio digital-original items, the sampling rate of the storage technology should equal that of the original item.
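
The link between sampling rate and frequency response can be illustrated with a short calculation. The sketch below is illustrative only: it assumes an ideal converter, for which the highest representable frequency is half the sampling rate (the Nyquist limit), and the function name is hypothetical. Real converters roll off somewhat below this limit.

```python
def nyquist_khz(sampling_rate_khz: float) -> float:
    """Upper bound on representable audio frequency for an ideal converter."""
    return sampling_rate_khz / 2.0


# Compare the CD rate with the rates discussed in this section.
for rate in (44.1, 48.0, 96.0, 192.0):
    print(f"{rate:g} kHz sampling: content below {nyquist_khz(rate):g} kHz")
```

At 48 kHz the limit sits just above the nominal 20 kHz range of hearing, while 96 kHz leaves room for signal components and noise well beyond it.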

2.3 Bit Depth: The bit depth fixes the dynamic range of an encoded audio event or item. 24 bit audio theoretically encodes a dynamic range that approaches the physical limits of listening, though in reality the technical limits of the system are slightly lower. 16 bit audio, the CD standard, may be inadequate to capture the dynamic range of many types of material, especially where high level transients are encoded, such as in the transfer of damaged discs. IASA recommends a bit depth of at least 24 bit to capture all analogue materials. For audio digital-original items, the bit depth of the storage technology should at least equal that of the original item. It is important that care is taken in recording to ensure that the transfer process takes advantage of the full dynamic range.
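
As a rough guide to what "theoretical" means here, the dynamic range of an ideal quantiser can be estimated from the bit depth. The sketch below assumes the textbook model of a full-scale sine wave and uniformly distributed quantisation noise (approximately 6.02 dB per bit plus 1.76 dB); the function name is hypothetical, and real converters fall short of these figures.

```python
import math


def ideal_dynamic_range_db(bits: int) -> float:
    """Theoretical signal-to-quantisation-noise ratio of an ideal N-bit quantiser.

    Assumes a full-scale sine wave and uniformly distributed quantisation noise.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(f"{bits} bit: about {ideal_dynamic_range_db(bits):.1f} dB")
```

The gap between the 24 bit theoretical figure and the 115 dB required of a converter in 2.4.1.2 reflects the practical limits of analogue circuitry.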

2.4 Analogue to Digital Converters (A/D)

2.4.1 In converting analogue audio to a digital data stream, the A/D converter should not colour the audio or add any extra noise. It is the most critical component in the digital preservation pathway. In practice, the A/D converter incorporated in a computer’s sound card cannot meet the specifications required, due to low cost circuitry and the inherent electrical noise in a computer. IASA recommends the use of discrete (stand alone) A/D converters connected via an AES/EBU or S/PDIF interface, IEEE1394 bus-connected (FireWire) discrete A/D converters or USB serial interface-connected discrete A/D converters that will convert audio from analogue to digital in accordance with the following specification. All specifications are measured at the digital output of the A/D converter, and are in accordance with Audio Engineering Society standard AES 17-1998 (r2004), IEC 61606-3, and associated standards as identified.

2.4.1.1 Total Harmonic Distortion + Noise (THD+N)
With signal 997 Hz at -1 dB FS, the A/D converter THD+N will be less than -105 dB unweighted, -107 dB A-weighted, 20 Hz to 20 kHz bandwidth limited.
With signal 997 Hz at -20 dB FS, the A/D converter THD+N will be less than -95 dB unweighted, -97 dB A-weighted, 20 Hz to 20 kHz bandwidth limited.

2.4.1.2 Dynamic Range (Signal to Noise)
The A/D converter will have a dynamic range of not less than 115 dB unweighted, 117 dB A-weighted. (Measured as THD+N relative to 0 dB FS, bandwidth limited 20 Hz to 20 kHz, stimulus signal 997 Hz at -60 dB FS).

2.4.1.3 Frequency Response
For an A/D sampling frequency of 48 kHz, the measured frequency response will be better than ± 0.1 dB for the range 20 Hz to 20 kHz.
For an A/D sampling frequency of 96 kHz, the measured frequency response will be better than ± 0.1 dB for the range 20 Hz to 20 kHz, and ± 0.3 dB for the range 20 kHz to 40 kHz.
For an A/D sampling frequency of 192 kHz, the frequency response will be better than ± 0.1 dB for the range 20 Hz to 20 kHz, and ± 0.3 dB from 20 kHz to 50 kHz (reference audio signal = 997 Hz, amplitude -20 dB FS).

2.4.1.4 Intermodulation Distortion IMD (SMPTE/DIN/AES17)
The A/D converter IMD will not exceed -90 dB. (AES17/SMPTE/DIN twin-tone test sequences, combined tones equivalent to a single sine wave at full scale amplitude).

2.4.1.5 Amplitude Linearity
The A/D converter will exhibit amplitude gain linearity of ± 0.5 dB within the range -120 dB FS to 0 dB FS. (997 Hz sinusoidal stimuli).

2.4.1.6 Spurious Aharmonic Signals
Better than -130 dB FS with stimulus signal 997 Hz at -1 dB FS.

2.4.1.7 Internal Sample Clock Accuracy
For an A/D converter synchronised to its internal sample clock, frequency accuracy of the clock measured at the digital stream output will be better than ±25 ppm.
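
To give a sense of scale for a parts-per-million tolerance, the sketch below converts it into a frequency deviation and into worst-case timing drift over a transfer. The function names and the one hour duration are illustrative assumptions, not part of the specification.

```python
def clock_tolerance_hz(nominal_rate_hz: float, tolerance_ppm: float) -> float:
    """Largest permitted deviation of the sample clock, in Hz."""
    return nominal_rate_hz * tolerance_ppm / 1_000_000


def worst_case_drift_ms(duration_s: float, tolerance_ppm: float) -> float:
    """Worst-case timing error accumulated over a recording, in milliseconds.

    seconds * ppm * 1e-6 gives seconds of drift; multiplying by 1e3 gives ms,
    so the combined factor is a division by 1000.
    """
    return duration_s * tolerance_ppm / 1000


print(clock_tolerance_hz(48_000, 25), "Hz deviation allowed at 48 kHz")
print(worst_case_drift_ms(3600, 25), "ms worst-case drift over one hour")
```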

2.4.1.8 Jitter
Interface jitter measured at A/D output < 5 ns.

2.4.1.9 External Synchronisation
Where the A/D converter sample clock will be synchronised to an external reference signal, the A/D converter must react transparently to incoming sample rate variations of ± 0.2% of the nominal sample rate. The external synchronisation circuit must reject incoming jitter so that the synchronised sample rate clock is free from artefacts and disturbances.

2.4.2 IEEE1394 and USB Audio Interfaces. Many A/D converters now provide the facilities to interface directly to a host computer via the high speed IEEE1394 (FireWire) and USB 2.0 serial interfaces. Both systems are successfully implemented as audio transmission interfaces across the major personal computer platforms, and can reduce the requirement to install a specialised, high-quality soundcard interface in the computer chassis. Audio quality is generally independent of the bus technology in use.

2.4.3 Selection of A/D Converters: The A/D converter is the most critical piece of technology in the digital preservation pathway. When choosing a converter, and before any further evaluation is undertaken, IASA recommends that all specifications are tested against the reference standards described above. Any converter which does not meet the basic IASA technical specifications will produce less than accurate conversions. In conjunction with technical evaluation, statistically valid blind listening tests should be carried out on short listed converters to determine overall suitability and performance. The specifications and testing described above are stringent and complex, and they are highly important in selecting and evaluating analogue to digital converters. The published specifications from equipment manufacturers are sometimes challenging to compare, often incomplete and occasionally difficult to reconcile with the performance of the device they purport to represent. It may suit certain communities or groups to undertake group or panel testing to maximise resources. Certain institutions, such as state archives, libraries or academic science departments, may be in a position to assist with testing.
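
When comparing short listed converters, it can help to record measured results in a consistent form and check them mechanically against the limits in 2.4.1. The following is a minimal sketch under stated assumptions: it covers only the unweighted single-figure limits from 2.4.1.1, 2.4.1.2, 2.4.1.4 and 2.4.1.6 to 2.4.1.8 (frequency response and amplitude linearity need per-band data and are omitted), and the key names and example measurements are hypothetical.

```python
# Pass conditions transcribed from 2.4.1 (unweighted, 20 Hz to 20 kHz).
# The dictionary keys are hypothetical names chosen for this sketch.
CHECKS = {
    "thdn_at_minus1_dbfs_db": lambda v: v < -105.0,
    "thdn_at_minus20_dbfs_db": lambda v: v < -95.0,
    "dynamic_range_db": lambda v: v >= 115.0,
    "imd_db": lambda v: v <= -90.0,
    "spurious_dbfs": lambda v: v < -130.0,
    "clock_error_ppm": lambda v: abs(v) < 25.0,
    "interface_jitter_ns": lambda v: v < 5.0,
}


def failed_checks(measured: dict) -> list:
    """Return the names of all measurements that miss their limit."""
    return [name for name, passes in CHECKS.items() if not passes(measured[name])]


# Hypothetical bench results for one converter, for illustration only.
candidate = {
    "thdn_at_minus1_dbfs_db": -108.2,
    "thdn_at_minus20_dbfs_db": -96.5,
    "dynamic_range_db": 113.0,  # misses the 115 dB requirement
    "imd_db": -94.0,
    "spurious_dbfs": -133.0,
    "clock_error_ppm": -12.0,
    "interface_jitter_ns": 2.1,
}

print(failed_checks(candidate))
```

A converter that passes every check still needs the blind listening tests described above; the check only screens out devices that fail on paper.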

2.5 Sound Cards: The sound card used in a computer for the purposes of audio preservation should have a reliable digital input with a high quality digital audio stream synchronisation mechanism, and should pass a digital audio data stream without change or alteration. As a discrete (stand alone) A/D converter must be used, the primary purpose of a sound card in audio preservation is to pass a digital signal to the computer data bus, though it may also be used for returning the converted signal to analogue for monitoring purposes. Care should be taken to choose a card that accepts the appropriate sampling rates and bit depths, and does not inject noise or other extraneous artefacts. IASA recommends the use of a high quality sound card that meets the following specification:

2.5.1 Sample rate support: 32 kHz to 192 kHz ± 5%.

2.5.2 Digital audio quantisation: 16-24 bits.

2.5.3 Varispeed: automatic by incoming audio or wordclock.

2.5.4 Synchronisation: internal clock, wordclock, digital audio input.

2.5.5 Audio interface: high speed AES/EBU conforming to AES3 specifications.

2.5.6 Jitter acceptance and signal recovery on inputs up to 100 ns without error.

2.5.7 Digital audio subcode pass-through.

2.5.8 Optional timecode inputs.

2.6 Computer Based Systems and Processing Software: Recent generations of computers have sufficient power to manipulate large audio files. Once in the digital domain, the integrity of the audio files should be maintained. As noted above, the critical points in the preservation process are converting the analogue audio to digital (which relies on the A/D converter) and entering the data into the system, either through the sound card or another data port. However, some systems truncate the word length of an item in order to process it, resulting in a lower effective bit depth, and others may only process compressed file formats such as MP3; neither is acceptable. IASA recommends that a professional audio computer based system be used whose processing word length exceeds that of the file (i.e. greater than 24 bit) and which does not alter the file format.
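
The effect of word length truncation can be shown on a single sample value. The sketch below is illustrative only: it models truncation as discarding the least significant bits, and the function name and sample value are hypothetical.

```python
def truncate_word_length(sample: int, from_bits: int, to_bits: int) -> int:
    """Discard the least significant bits, as a system with a shorter word would."""
    shift = from_bits - to_bits
    return (sample >> shift) << shift


sample_24 = 0x123456  # an arbitrary 24 bit sample value
truncated = truncate_word_length(sample_24, 24, 16)

print(f"original : {sample_24:#08x}")
print(f"truncated: {truncated:#08x}")
print(f"lost     : {sample_24 - truncated} quantisation steps of the 24 bit scale")
```

The discarded low-order bits cannot be recovered later, which is why the processing word length should exceed that of the file.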

2.7 Data Reduction: It has become generally accepted in audio archiving that when selecting a digital target format, formats employing data reduction (frequently mistakenly called data “compression”) based on perceptual coding (lossy codecs) must not be used. Transfers employing such data reduction mean that parts of the primary information are irretrievably lost. The results of such data reduction may sound identical or very similar to the unreduced (linear) signal, at least for the first generation, but the further use of the data reduced signal will be severely restricted and its archival integrity has been compromised.

2.8 File Formats

2.8.1 There are a number of linear audio file formats that may be used to encode audio. However, the wider the acceptance and use of a format in a professional audio environment, the greater the likelihood of its long term acceptance, and the greater the probability that professional tools will be developed to migrate it to future file formats when that becomes necessary. Because of the simplicity and ubiquity of linear Pulse Code Modulation (PCM) [interleaved for stereo], IASA recommends the use of WAVE (file extension .wav), developed by Microsoft and IBM as an extension of the Resource Interchange File Format (RIFF). WAVE files are widely used in the professional audio industry.
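
To illustrate what interleaved linear PCM in a WAVE container looks like in practice, the sketch below writes a short 24 bit, 96 kHz stereo test tone using the `wave` module from the Python standard library and reads the header back. It is a minimal example, not a transfer workflow: the function name, the 997 Hz tone and the half-scale level are arbitrary choices, and the file is held in memory rather than on disk.

```python
import io
import math
import wave


def write_pcm24_stereo(rate_hz: int, n_frames: int) -> bytes:
    """Build a 24 bit stereo WAVE file containing a 997 Hz tone."""
    full_scale = 2 ** 23 - 1
    frames = bytearray()
    for n in range(n_frames):
        value = int(0.5 * full_scale * math.sin(2 * math.pi * 997 * n / rate_hz))
        sample = value.to_bytes(3, "little", signed=True)
        frames += sample + sample  # interleaved: left sample, then right sample
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(2)
        writer.setsampwidth(3)  # 3 bytes per sample = 24 bit
        writer.setframerate(rate_hz)
        writer.writeframes(bytes(frames))
    return buf.getvalue()


wav_bytes = write_pcm24_stereo(96_000, 960)
with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
    print("channels   :", reader.getnchannels())
    print("bit depth  :", reader.getsampwidth() * 8)
    print("sample rate:", reader.getframerate())
    print("frames     :", reader.getnframes())
```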

2.8.2 BWF .wav files [EBU Tech 3285] are an extension of .wav and are supported by most recent audio technology. The benefit of BWF for both archiving and production uses is that metadata can be incorporated into the headers which are part of the file. In most basic exchange and archiving scenarios this is advantageous; however, the fixed nature of the embedded information may become a liability in large and sophisticated data management systems (see the discussion in Chapter 3 Metadata and Chapter 7 Small Scale Approaches to Digital Storage Systems). This, and other limitations of BWF, can be managed by using only a minimal set of data within BWF and maintaining other data in external data management systems. AES31-2-2006, the AES standard on “Network and file transfer of audio - Audio-file transfer and exchange - File format for transferring digital audio data between systems of different type and manufacture”, is largely compatible with the standard set in BWF, and it is expected that future development in the area will continue to make the format viable. The BWF format is widely accepted by the archiving community and, with the limitations described in mind, IASA recommends the use of BWF .wav files [EBU Tech 3285] for archival purposes.
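
Because BWF metadata lives in an additional RIFF chunk (the broadcast extension chunk, identified as `bext`), a file can be checked for it by walking the chunk list. The sketch below is a simplified illustration: the function names are hypothetical, the demonstration file is built in memory with zero-filled placeholder payloads, and no attempt is made to decode the fields inside the chunk.

```python
import struct


def list_riff_chunks(data: bytes) -> list:
    """Return (chunk_id, size) pairs for the top-level chunks of a RIFF/WAVE file."""
    riff_id, _riff_size, form = struct.unpack_from("<4sI4s", data, 0)
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    offset = 12
    while offset + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        chunks.append((chunk_id, size))
        offset += 8 + size + (size & 1)  # chunks are padded to an even length
    return chunks


def make_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Assemble one RIFF chunk, adding the pad byte for odd-length payloads."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# Placeholder file for demonstration: payloads are zero-filled, not real audio.
body = (b"WAVE"
        + make_chunk(b"bext", bytes(602))
        + make_chunk(b"fmt ", bytes(16))
        + make_chunk(b"data", bytes(6)))
demo_file = b"RIFF" + struct.pack("<I", len(body)) + body

chunk_ids = [chunk_id for chunk_id, _size in list_riff_chunks(demo_file)]
print("chunks found:", chunk_ids)
print("has BWF bext chunk:", b"bext" in chunk_ids)
```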

2.8.3 Multitrack audio and film or video soundtracks, or large audio files, may use RF64 [EBU Tech 3306], which is compatible with BWF, AES-31, or a .wav file in a Material Exchange Format (MXF) wrapper. As these are all still under development, one pragmatic approach may be to create multiple time coherent mono BWF files wrapped in the tar (tape archive) format.
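
The tar approach can be sketched with the `tarfile` module from the Python standard library. This is an illustration under stated assumptions rather than a recommended tool: the track file names are hypothetical, the payloads are placeholders standing in for real mono BWF files, and the archive is written uncompressed so that the audio data is not altered.

```python
import io
import tarfile


def wrap_tracks_in_tar(tracks: dict) -> bytes:
    """Bundle named mono files into one uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:  # "w" = no compression
        for name, payload in sorted(tracks.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Placeholder payloads; in practice each would be a complete mono BWF file.
tracks = {
    "session01_track01.wav": b"placeholder for track 1",
    "session01_track02.wav": b"placeholder for track 2",
}
tar_bytes = wrap_tracks_in_tar(tracks)

with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as archive:
    print(archive.getnames())
```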

2.9 Audio Path: The combination of reproduction equipment, signal cables, mixers and other audio processing equipment should have specifications that equal or exceed those of digital audio at the specified sampling rate and bit depth. The replay equipment, audio path, target format and standards must exceed those of the original carrier.