Scholars Portal Registry of File Formats
Scholars Portal (SP) requires immediate identification of the format of each file sent by a publisher, which helps mitigate the risk posed by format obsolescence. To this end, SP uses DROID, JHOVE and PRONOM through the FITS software package.
While Scholars Portal is not dependent on or restricted to any particular format or group of formats, it aims to use well-known, widely accepted formats that support long-term preservation. If a publisher wants to use a specific format not meeting these criteria, an agreement must be reached between the publisher and SP.
2. Implementation Examples
- Scholars Portal uses DROID for format identification during ingest, when a file format is associated with each file.
- DROID (Digital Record Object Identification) is a software tool developed by The National Archives (UK) to perform automated batch identification of file formats. It is a platform-independent Java tool, freely available for download under an open source license.1
- Scholars Portal employs JHOVE as a tool for further format-specific identification, validation and characterization of the file.
- JHOVE (JSTOR/Harvard Object Validation Environment) is an extensible framework for format validation developed jointly by JSTOR and Harvard University Library.2
- During DROID identification, a file format is associated with each file and, where possible, the file is linked to the format's entry in PRONOM, the format registry of The National Archives (UK).
- PRONOM is a resource providing impartial and definitive information about the file formats, software products and other technical components required to support long-term access to electronic records and other digital objects of cultural, historical or business value.3
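The linking step described in the list above can be sketched in Python. This is a minimal illustration, not Scholars Portal's actual ingest code: the function name `build_format_element` is hypothetical, the PREMIS 2 namespace URI `info:lc/xmlns/premis-v2` is an assumption (the source does not state which PREMIS version is used), and the identification result is passed in by hand rather than read from DROID. When no PRONOM identifier is available, the registry block is left out, reflecting the "where possible" qualifier.

```python
import xml.etree.ElementTree as ET

# Assumption: PREMIS version 2 namespace; the source does not name a version.
PREMIS_NS = "info:lc/xmlns/premis-v2"
PRONOM_REGISTRY = "http://www.nationalarchives.gov.uk/pronom"

# Serialize elements in this namespace with the "PREMIS:" prefix.
ET.register_namespace("PREMIS", PREMIS_NS)


def _child(parent, tag, text=None):
    """Append a namespaced child element, optionally with text content."""
    node = ET.SubElement(parent, f"{{{PREMIS_NS}}}{tag}")
    if text is not None:
        node.text = text
    return node


def build_format_element(name, version, puid=None):
    """Return a PREMIS format element as a string (hypothetical helper).

    name, version: format name and version from identification.
    puid: PRONOM unique identifier such as "fmt/18", or None if the
          format could not be matched to a registry entry.
    """
    root = ET.Element(f"{{{PREMIS_NS}}}format")
    designation = _child(root, "formatDesignation")
    _child(designation, "formatName", name)
    _child(designation, "formatVersion", version)
    if puid:
        # Link to the registry only when an identifier is known.
        registry = _child(root, "formatRegistry")
        _child(registry, "formatRegistryName", PRONOM_REGISTRY)
        _child(registry, "formatRegistryKey", puid)
    return ET.tostring(root, encoding="unicode")


record = build_format_element(
    "Acrobat PDF 1.4 - Portable Document Format", "1.4", "fmt/18"
)
print(record)
```

Keeping the registry link optional means a file with an unrecognized format still receives a format designation and can be flagged for follow-up.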
Example characterization and reference to format registry:
<PREMIS:format>
  <PREMIS:formatDesignation>
    <PREMIS:formatName>Acrobat PDF 1.4 - Portable Document Format</PREMIS:formatName>
    <PREMIS:formatVersion>1.4</PREMIS:formatVersion>
  </PREMIS:formatDesignation>
  <PREMIS:formatRegistry>
    <PREMIS:formatRegistryName>http://www.nationalarchives.gov.uk/pronom</PREMIS:formatRegistryName>
    <PREMIS:formatRegistryKey>fmt/18</PREMIS:formatRegistryKey>
  </PREMIS:formatRegistry>
</PREMIS:format>
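A record like the example above can be read back with Python's standard library. The following is a sketch under stated assumptions: the `PREMIS` prefix is taken to be bound to the PREMIS 2 namespace `info:lc/xmlns/premis-v2` (the example does not show the declaration, so one is added here), and `read_format` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: PREMIS version 2 namespace; the example omits the declaration.
PREMIS_NS = "info:lc/xmlns/premis-v2"

SAMPLE = f"""<PREMIS:format xmlns:PREMIS="{PREMIS_NS}">
  <PREMIS:formatDesignation>
    <PREMIS:formatName>Acrobat PDF 1.4 - Portable Document Format</PREMIS:formatName>
    <PREMIS:formatVersion>1.4</PREMIS:formatVersion>
  </PREMIS:formatDesignation>
  <PREMIS:formatRegistry>
    <PREMIS:formatRegistryName>http://www.nationalarchives.gov.uk/pronom</PREMIS:formatRegistryName>
    <PREMIS:formatRegistryKey>fmt/18</PREMIS:formatRegistryKey>
  </PREMIS:formatRegistry>
</PREMIS:format>"""


def read_format(xml_text):
    """Extract format name, version and registry reference (hypothetical helper).

    Missing elements are returned as None rather than raising an error.
    """
    ns = {"p": PREMIS_NS}
    root = ET.fromstring(xml_text)

    def text(path):
        node = root.find(path, ns)
        return node.text.strip() if node is not None and node.text else None

    return {
        "name": text("p:formatDesignation/p:formatName"),
        "version": text("p:formatDesignation/p:formatVersion"),
        "registry": text("p:formatRegistry/p:formatRegistryName"),
        "key": text("p:formatRegistry/p:formatRegistryKey"),
    }


info = read_format(SAMPLE)
print(info["name"], info["key"])
```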
- Sourceforge.net. (2009). DROID. Retrieved from http://droid.sourceforge.net/
- Harvard University Library. (2009, February 25). JHOVE – JSTOR/Harvard Object Validation Environment. Retrieved from http://hul.harvard.edu/jhove
- The National Archives. The Technical Registry PRONOM. Retrieved from
4. Document History