7.7 Practical Hardware Arrangements

7.7.1.1  The following information describes how a practical system might be implemented. As discussed above, the assumption is that all of the audio archival data will be stored on hard disk and that all of it will also be mirrored on data tape such as LTO.

7.7.2 Hard Disk Drives

7.7.2.1  A common and affordable approach to data storage on disk is to connect to a cluster of HDDs (hard disk drives) arranged in a RAID array (see section 6.3.14 Hard Disk Drives). RAID level 1 is little more than two mirrored drives: two copies of the data are kept on different physical hardware, so that if one disk fails the data is still available on the other. Higher level RAID arrays (2 to 5) implement increasingly complex systems of data redundancy and parity checking that ensure data integrity is maintained. The higher level RAID arrays achieve the same level of security as level 1, or mirroring, but give up significantly less storage space to redundancy. RAID 5, for example, may have a 25% storage loss (or less, depending on implementation), compared with 50% for RAID 1. Sophisticated arrays are widely available.
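The storage loss figures quoted above can be illustrated with a short calculation. The following Python sketch is a simplified model only: it assumes identical disks, treats RAID 1 as a full mirror across all disks and RAID 5 as giving up one disk's worth of capacity to parity, and ignores file system and controller overheads. The function names are illustrative and do not come from any particular product.

```python
def raid_overhead(level: int, disks: int) -> float:
    """Return the fraction of raw capacity given up to redundancy."""
    if level == 1:
        if disks < 2:
            raise ValueError("RAID 1 needs at least two disks")
        # Every disk holds a full copy, so only one disk's worth is usable.
        return (disks - 1) / disks
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        # One disk's worth of capacity is used for distributed parity.
        return 1 / disks
    raise ValueError("only RAID 1 and RAID 5 are modelled in this sketch")


def usable_capacity_tb(level: int, disks: int, disk_size_tb: float) -> float:
    """Return the usable capacity in terabytes for an array of identical disks."""
    raw = disks * disk_size_tb
    return raw * (1 - raid_overhead(level, disks))


if __name__ == "__main__":
    print(raid_overhead(1, 2))            # two mirrored drives
    print(raid_overhead(5, 4))            # four-disk RAID 5
    print(usable_capacity_tb(5, 4, 2.0))  # four 2 TB disks in RAID 5
```

Under this model a two-disk mirror loses half of its raw capacity, while a four-disk RAID 5 array loses one quarter, and the loss shrinks further as more disks are added.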

7.7.3 Tape Backup

7.7.3.1  No single component of a digital system can be considered reliable; instead, the reliability of the system is achieved through multiple redundant copies at every stage. The final and most important component in the storage chain is the data tape. In the recent past LTO has gained popularity for this purpose (see section 6.3.12 Selection and Monitoring of Data Tape Media); however, other data tape formats may be appropriate depending on the particular circumstances.

7.7.3.2  All data on disk storage should be duplicated on a suitable storage tape. A minimum of two sets of data tapes must be produced, to be stored in physically different places. As it is not unusual for the second set of tapes to be required in the restoration of the data, many established archives make three sets of copies: two to be kept near the system for ease of access and a third stored remotely to protect against physical disasters. It has become customary to make the separate sets of data tapes using different products, and to buy a considerable quantity of each product from the same batch at one time. This makes quality control and rescue measures easier should a batch of a given product fail. Appropriate volume management software will aid in the backup and retrieval process, especially if the system incorporates a number of storage devices.

7.7.3.3  Error checking is difficult to implement in open source and low-tech solutions because that capability is linked to specific hardware. Nonetheless, a possible low-tech alternative to proper error testing is described in this paragraph. The data management software has a catalogue (with a printer attached). The hard disk (in RAID) contains a complete set of data. All data is copied onto at least two identical tapes. As data is copied onto a tape, a unique, human-readable identifier is printed onto a label which is attached to the tape. The same identifier can be recorded in the header of the tape. The data management system can be scripted to prompt the user to find and insert the tape identified by the system. Rather than checking the tape for errors, the system verifies the content of the tape against the hard disk. The hard disk can check the veracity of its own data content and is aware of any failings itself. If the verification of the tape fails, the system can produce a new tape from the hard disk. Assuming 20 terabytes of storage and a system that verifies two tapes a day, every tape and its duplicate can be verified three times per year. In the event of a disk failure requiring the data tapes to replace it, there will be two tapes which have been checked within the previous four months. The risk that both tapes and the hard disk would fail is very low.
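The verification step described above might be scripted along the following lines. This Python sketch is one possible approach rather than a prescribed method. It assumes that the content of the inserted tape is readable as an ordinary directory tree (for example after being restored to a staging area), that the catalogue can supply the list of relative file paths expected on that tape, and that comparing SHA-256 checksums is an acceptable test of equality. The names `verify_tape`, `disk_root`, `tape_root` and `catalogue_entries` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tape(disk_root: Path, tape_root: Path,
                catalogue_entries: Iterable[str]) -> List[str]:
    """Return the catalogue entries whose tape copy is missing or differs from disk."""
    failures = []
    for relative in catalogue_entries:
        disk_file = disk_root / relative
        tape_file = tape_root / relative
        if not disk_file.is_file() or not tape_file.is_file():
            # A file listed in the catalogue is absent from one of the copies.
            failures.append(relative)
        elif sha256_of(disk_file) != sha256_of(tape_file):
            # Both copies exist but their content no longer matches.
            failures.append(relative)
    return failures
```

An empty result indicates that the tape agrees with the hard disk for every file listed. Any entry returned would be a reason to produce a new tape from the disk copy.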

7.7.4 Single (or Double) Operator Storage System

7.7.4.1  The simplest archival storage system would be to attach a separate RAID array containing only the audio data to the primary DAW (digital audio workstation). This configuration is only possible for institutions with one operator in the digitising process. A requirement for the success of this approach is a well structured plan for digitisation and a dedicated disk array, so that the work can be carried out continuously without major interruptions. This will ensure that the data on the HDD attached to the DAW is copied to tape whenever enough data has accumulated to fill the target medium.
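The rule of copying to tape once enough data has accumulated to fill the target medium can be expressed as a simple check. The Python sketch below is illustrative only. The function name, the use of byte counts and the default fill ratio of 0.95 (leaving a small margin below the nominal tape capacity) are assumptions, not recommendations from this document.

```python
from typing import Iterable


def tape_is_due(pending_file_sizes: Iterable[int],
                tape_capacity_bytes: int,
                fill_ratio: float = 0.95) -> bool:
    """Return True when the data not yet on tape would fill the target medium.

    pending_file_sizes: sizes in bytes of files on disk not yet written to tape.
    tape_capacity_bytes: usable capacity of one tape (an assumed, site-specific value).
    fill_ratio: fraction of the capacity that counts as "full" (assumed default).
    """
    if not 0 < fill_ratio <= 1:
        raise ValueError("fill_ratio must be greater than 0 and at most 1")
    return sum(pending_file_sizes) >= tape_capacity_bytes * fill_ratio
```

A script run at the end of each digitisation session could call such a check and prompt the operator to write a tape, and its duplicates, when it returns True.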

7.7.4.2  If two operators, each with a workstation, are undertaking the digitisation tasks, it will be necessary to provide access to a shared drive or drives. The sharing of such resources can be achieved by defining one of the computers as the server, configuring it so that it manages the drives, and implementing a single wire sharing capability. Such an approach is relatively easy to implement and allows sharing between two operators, though it requires some procedural agreements to avoid conflicts. Logical organisation of data and strict naming procedures are a necessity in small scale manual storage systems.

7.7.4.3  For a system of the size described here, it might be more effective to establish a partnership with a larger, archivally established institution, or to contract a storage service provider. Nonetheless, the approach above is possible.

7.7.5 Multiple Operator Storage System

7.7.5.1  For any number of connections greater than two, a networked system of data storage and backup should be implemented. Such a networked system allows access to multiple users in accordance with the rules set down by the data management system. Small scale networks are relatively common and, with the right level of knowledge, easy and affordable to implement. Reasonable quantities of storage can be achieved with an enterprise level attached storage device. Storage technologies and products can be split into three main types: direct-attached storage (DAS), network-attached storage (NAS) and the storage area network (SAN). NAS has better performance and scalability than DAS, and it is cheaper and simpler to configure than SAN. From a cost-benefit point of view, NAS is the most appropriate scalable technology for systems of the size under discussion.

7.7.5.2  Most low cost NAS devices exhibit reduced bandwidth when compared to the more expensive devices, resulting in slower access times or a lower number of allowable simultaneous accesses. This should present no major problem to smaller collections, as the requirement for simultaneous access remains low, especially if MP3 derivatives of the preservation master copies are used for access.
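The effect of using MP3 derivatives on the number of simultaneous accesses can be estimated with simple arithmetic. The Python sketch below is a rough illustration: the sustained NAS throughput of 100 Mbit/s, the 96 kHz, 24 bit stereo master and the 128 kbit/s MP3 derivative are assumed example values, and protocol overhead, disk seek behaviour and file headers are ignored.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Return the data rate of uncompressed linear PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def max_simultaneous_streams(throughput_bps: int, stream_bitrate_bps: int) -> int:
    """Return how many real-time streams fit into a given sustained throughput."""
    return throughput_bps // stream_bitrate_bps


if __name__ == "__main__":
    nas_throughput = 100_000_000              # assumed sustained throughput, bit/s
    master = pcm_bitrate_bps(96_000, 24, 2)   # assumed preservation master format
    mp3 = 128_000                             # assumed access derivative, bit/s
    print(master)
    print(max_simultaneous_streams(nas_throughput, master))
    print(max_simultaneous_streams(nas_throughput, mp3))
```

Under these assumptions the same device could serve far more listeners from MP3 derivatives than from the uncompressed masters, which is why modest bandwidth is rarely a limiting factor for access in a small collection.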

7.7.5.3  A typical small scale networked storage system may comprise a server class desktop computer connected to a NAS device. The NAS would have the capability of mounting multiple hard disks in a RAID array. An average low cost NAS would hold between 0.5 and 20 terabytes of disk storage (noting that the penalty for RAID is less usable storage than the raw disk size indicates). The digital audio workstations (DAWs) access the NAS via an Ethernet switch or similar device which, if configured properly, has the effect of separating the storage facility from the office LAN (local area network) and improves the security of the storage facility. The HDDs would be backed up onto data tape.