6.6 Access

6.6.1 Introduction

The OAIS Reference Model defines “access” as the entity that “provides the services and functions that support consumers in determining the existence, description, location and availability of information stored in the OAIS, and allowing consumers to request and receive information products.” In other words, access comprises the mechanisms and processes by which content is found and retrieved. IASA-TC 03 “The Safeguarding of the Audio Heritage: Ethics, Principles and Preservation Strategy” makes the point that “the primary aim of an archive is to ensure sustained access to stored information”. The preservation of the content is a prerequisite to sustained access to the content, and in a well-planned archive access is a direct outcome of it.

In its simplest form, access is the ability to locate content and, in response to an authorised request, allow retrieval of the content for listening or, as long as the rights associated with a work allow it, for creating a copy that can be taken away. In the connected digital environment access can be provided remotely. Access, however, is more than just the ability to deliver an item. Most technically constructed archival systems can deliver an audio file on request, but a true access system provides finding and searching capability and delivery mechanisms, and allows interaction and negotiation regarding content. This adds a new dimension to access beyond that of conquering distance. In this new services-based model of retrieval, access could be considered a dialogue between the provider’s system and the user’s browser.

6.6.2 Integrity in On Line and Off Line Access Environments

Prior to the existence of remote access in the online environment, such things as authenticity and integrity were established by individuals in the reading rooms and listening posts of the collecting institutions. The content was delivered by representatives of institutions whose reputation spoke for the integrity of the content. Original materials could be retrieved for examination if the copies were questioned.

The online environment still relies to some extent on the trusted nature of the collecting institution. However, an unambiguously original item can never be provided online, and the possibility of undetected tampering or accidental corruption exists within the archive and distribution network. To counter this, various systems exist which mathematically attest to the authenticity or integrity of an item or work.

Authenticity is concerned with knowing that something has originated from a particular source. The trusted nature of the institution creating the content attests to the processes, and a certificate is issued by a certificate authority, which a third party can use as a guarantee of authenticity. Various systems exist and are valuable where this could be an issue.

Integrity is concerned with knowing whether an item has been damaged or tampered with. Checksums are the common way of dealing with integrity, and are valuable tools in both the archive and the distribution network (see 6.3.23 Integrity and Check sums). However, as is discussed in 6.3.23, checksums are fallible, and their use requires the archive to monitor the latest developments.
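
The checksum comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the choice of SHA-256 is an assumption (this section does not prescribe an algorithm), and the function names `file_digest` and `verify_file` are hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks.

    The default algorithm (SHA-256) is an assumption made for this sketch.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large audio files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_digest, algorithm="sha256"):
    """Return True if the file's current digest matches the recorded one."""
    return file_digest(path, algorithm) == expected_digest.lower()
```

In such a scheme the archive would record the digest when the file is ingested and recompute it after delivery or at intervals in storage; a mismatch indicates that the file has changed, although it does not say how or why.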

6.6.3 Standards and Descriptive Metadata

Detailed, appropriate, organised metadata is the key to broad exposure and effective access. Chapter 3 Metadata discusses metadata in many of its forms and requirements in detail, and should be referred to when developing a delivery system. Ambitious access facilities using, for example, map interfaces or timelines will only function if there is metadata to support them in a structured and organised form.

The most cost effective way to manage and create the appropriate metadata is to ensure that the requirements for all the components in the delivery system are established prior to the ingest of the content. In this way the metadata creation steps can be built into the pre-ingest and ingest workflows. The cost of creating only a minimal set, as discussed in Section 7.4, is the extra task of adding and structuring the metadata later, in a system which has already been created.
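
One way to build such requirements into a pre-ingest workflow is to have each delivery component declare the fields it depends on, and to check every record against those declarations before ingest. The following is a sketch under stated assumptions: the component names, the field names and the function `missing_fields` are all hypothetical and are not drawn from any metadata standard.

```python
# Hypothetical mapping of delivery components to the metadata fields they need.
COMPONENT_REQUIREMENTS = {
    "search": {"title", "identifier"},
    "map": {"latitude", "longitude"},
    "timeline": {"recording_date"},
}


def missing_fields(record, requirements=COMPONENT_REQUIREMENTS):
    """Return {component: sorted list of absent fields} for one record.

    A field counts as absent if it is missing, None, or an empty string.
    Components whose requirements are fully met are left out of the report.
    """
    report = {}
    for component, fields in requirements.items():
        absent = sorted(name for name in fields if record.get(name) in (None, ""))
        if absent:
            report[component] = absent
    return report


record = {
    "title": "Field recording, tape 1",
    "identifier": "example-0001",
    "recording_date": "1962-05-14",
}
print(missing_fields(record))
```

For this record the report names only the map component, since the coordinates it needs were never captured. Running the check at pre-ingest makes the gap visible while the information can still be added cheaply.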

6.6.4 Formats and Dissemination Information Packages (DIP)

The Dissemination Information Package (DIP) is the Information Package received by the Consumer in response to a request for content, or an order. The delivery system should also be able to deliver a result set or a report from a query.

Web developers and the access “industry” have developed delivery systems based, naturally, around delivery formats. Delivery formats are not suitable for preservation, and generally, preservation formats are not suitable for delivery. In order to facilitate delivery, separate access copies are created, either routinely or “on demand” in response to a request. Content may be streamed, or downloaded in compressed delivery formats. The quality of the delivery format is generally proportional to its bandwidth requirements, and collection managers must make decisions about the type of delivery formats based on the user requirements and the infrastructure available to support delivery. QuickTime and Real Media formats have proven to be popular streaming formats, and MP3 (MPEG-1 Audio Layer 3) is a popular downloadable format which may also be streamed. There is no requirement to select only these formats for delivery, and many collection delivery systems provide a choice of formats to the user.

For some types of material it may be necessary to create two master WAV files: one, a preservation or archival master that replicates exactly the format and condition of the original; the second, a dissemination master that may have been processed in order to improve the audio quality of the content. The second master will allow dissemination copies to be created as required. It is expected that distribution formats will continue to change and evolve at a faster rate than master formats.
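
The relationship between delivery quality and bandwidth can be made concrete with some simple arithmetic. The sketch below compares the data rate of an uncompressed linear PCM master with that of a compressed delivery copy. The figures used (48 kHz, 24 bit, stereo for the master, and 128 kbit/s for the MP3 copy) are illustrative assumptions rather than recommendations of this section, and the function names are hypothetical.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Data rate of uncompressed linear PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def size_megabytes(bit_rate_kbps, duration_seconds):
    """Approximate audio payload in megabytes (10**6 bytes), ignoring headers."""
    return bit_rate_kbps * 1000 * duration_seconds / 8 / 1_000_000


ONE_HOUR = 3600  # seconds

master_rate = pcm_bit_rate_kbps(48_000, 24, 2)  # assumed master parameters
delivery_rate = 128                             # assumed MP3 bit rate in kbit/s

print(master_rate, size_megabytes(master_rate, ONE_HOUR))
print(delivery_rate, size_megabytes(delivery_rate, ONE_HOUR))
```

Under these assumptions one hour of the master needs 2304 kbit/s and about 1037 MB, while the delivery copy needs 128 kbit/s and about 58 MB, which is the practical reason separate access copies are created.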

6.6.5 Search Systems and Data Exchange

The extent to which content can be discovered sets the limit on the amount of use of the material. In order to ensure broad usage it is necessary to expose content through various means.

Remote databases can be searched using Z39.50, a client-server protocol for searching and retrieving information. Z39.50 is widely used in the library and higher education sectors, and its existence predates the web. Given the extent of its use, it is advisable to establish a Z39.50-compliant client-server interface on databases. However, this protocol is being rapidly replaced in the web environment by SRU/SRW (Search/Retrieve via URL and Search/Retrieve Web Service respectively). SRU is a standard XML-focused search protocol for Internet search queries, utilizing CQL (Contextual Query Language), a standard syntax for representing queries (http://www.loc.gov/standards/sru/). SRW is a web service that provides a SOAP interface for queries in partnership with SRU. Various open source projects support SRU/SRW in relation to the major open source repository software such as DSpace and Fedora.

OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is a mechanism for repository interoperability. Repositories expose structured metadata via OAI-PMH; this metadata is aggregated and used to support queries on the content. OAI-PMH nodes can be incorporated into the common repositories. OAI-ORE (Object Reuse and Exchange) will be important for the sound and audiovisual archiving community as it addresses the very important requirement to be able to deal efficiently with compound information objects in synchronisation with Web architecture. It allows the description and exchange of aggregations of Web resources. “These aggregations, sometimes called compound digital objects, may combine distributed resources with multiple media types including text, images, data, and video” (http://www.openarchives.org/).

In order for the sophisticated online environment to work it is necessary to have interoperable metadata and content. This means that there must be some shared understanding of the attributes included, a general schema which is able to operate in a variety of frameworks, and a set of protocols about exchanging content. This is best achieved, as always in the digital environment, by adhering to the recommended standards, schemas, frameworks and protocols and avoiding proprietary solutions.
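
To illustrate how these protocols appear in practice, the following Python sketch builds an SRU searchRetrieve request carrying a CQL query, builds an OAI-PMH ListRecords request, and extracts identifiers and titles from a harvested response. The request parameter names are those defined by the SRU and OAI-PMH specifications. The base URLs, the identifier, the sample record and the function names are hypothetical, the sample response is abbreviated (a real response also carries elements such as responseDate and request), and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def sru_search_url(base_url, cql_query, max_records=10, version="1.2"):
    """Build an SRU searchRetrieve request URL carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return base_url + "?" + urlencode(params)


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (Dublin Core by default)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def harvested_titles(oai_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(oai_xml)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs


# Abbreviated, hypothetical response used only to exercise the parser.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:0001</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Field recording, tape 1</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(sru_search_url("http://archive.example.org/sru", 'dc.title = "field recording"'))
print(oai_list_records_url("http://archive.example.org/oai"))
print(harvested_titles(SAMPLE_RESPONSE))
```

The two request styles reflect the different roles of the protocols: SRU answers a query at the moment it is asked, whereas OAI-PMH lets an aggregator copy metadata in bulk and answer queries from its own index.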

6.6.6 Rights and Permissions

It is important to note that all access is subject to the rights established in the items and the permission of the owner to use the content. Various rights management approaches exist, from “fingerprinting” the content, to managing the permissions of individuals to access it, to the physical separation of the storage environment. The particular rights system implemented will depend on the type of content, the technical infrastructure and the owner and user community, and it is beyond the scope of this document to define or describe a particular approach.