To ensure the long-term integrity and reliability of information, SP uses a number of tools and procedures to detect bit corruption or loss. The repository uses widely accepted hashing techniques to generate digest values for new content and carries out regular, automated fixity checks on archived content. For every file associated with a journal article, including the bibliographic metadata, the repository generates and records MD5, SHA-1, and CRC32 values. Digest values are stored in the preservation metadata, which is kept separate from the article's files. SP also records the size of each file in bytes.
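As an illustration only, the digest generation described above could be sketched as follows. This is not SP's actual ingest code: the function name `compute_digests`, the returned field names, and the chunk size are assumptions made for the example. It reads a file once and produces the three digest values and the byte count in a single pass.

```python
import hashlib
import zlib


def compute_digests(path, chunk_size=1024 * 1024):
    """Return MD5, SHA-1, CRC32 and size in bytes for one file.

    Hypothetical helper: the name and the result layout are illustrative,
    not taken from the repository's software.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    size = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            md5.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC32 across chunks
            size += len(chunk)
    return {
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "crc32": format(crc & 0xFFFFFFFF, "08x"),
        "size_bytes": size,
    }
```

Reading the file in chunks keeps memory use constant for large files, and updating all digests in one loop avoids reading the same file three times.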
In addition, the Pillar Axiom storage array has health-monitoring, diagnostic, and error-correction tools. The storage controllers will automatically report errors to the University of Toronto Libraries' Information Technology Services staff. The MarkLogic database performs consistency checks on bibliographic and preservation metadata.
Please see the complete text of the Fixity Check Procedures document for additional information. The Risk Analysis and Management Strategies document contains information about the Pillar array's health-monitoring and error-correction functionality.
Digital Preservation Policy Librarian
SP is in the process of generating digest values for content that was ingested before the repository adopted its fixity procedures. Until this process is complete, some files do not have digest values and are not candidates for regular fixity testing.
SP does not generate digest values for its preservation metadata files, so there is no fixity check for the fixity values themselves. These files are, however, subject to MarkLogic's internal consistency checking.
SP operates a rolling test of digest values. Fixity tests that produce a mismatch are automatically reported to staff through the repository's issue tracking system.
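A rolling test of this kind might look roughly like the sketch below. It is a simplified illustration under several assumptions: the record layout (`path`, `sha1`), the function names, the batch-and-cursor scheme, and the use of SHA-1 alone are all hypothetical, and the returned list of mismatches merely stands in for whatever is sent to the issue tracking system. Records without a stored digest are skipped, matching the situation of content ingested before fixity procedures were adopted.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_batch(records, cursor, batch_size):
    """Verify one batch of records, starting at `cursor` and wrapping around.

    `records` is a list of dicts with a "path" key and an optional "sha1"
    key (hypothetical layout). Returns (mismatches, next_cursor); a real
    system would file the mismatches as issues rather than return them.
    """
    mismatches = []
    if not records:
        return mismatches, 0
    count = min(batch_size, len(records))
    for offset in range(count):
        record = records[(cursor + offset) % len(records)]
        expected = record.get("sha1")
        if expected is None:
            continue  # no stored digest yet, so not a candidate for testing
        path = Path(record["path"])
        if not path.is_file():
            mismatches.append((record["path"], "missing file"))
        elif sha1_of(path) != expected:
            mismatches.append((record["path"], "digest mismatch"))
    return mismatches, (cursor + count) % len(records)
```

Calling `check_batch` repeatedly with the returned cursor walks through the whole collection a batch at a time and then starts over, which is one simple way to spread the verification load evenly.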