The Importance of File Formats

Managing file formats is an important part of digital preservation, and it helps to understand how digital file formats are used and what that use implies for long-term access. A full listing of recommended formats from the Library of Congress is available online. The Library of Congress’s recommendations are based on seven sustainability factors:

  • Disclosure – specifications and tools for validating the integrity and accessibility of the format exist, so you can find out how information is encoded as bits and bytes.
  • Adoption – the format is widely used. If everyone is using it, tools are more likely to be available for migration and emulation.
  • Transparency – the format is easy to analyze with basic tools, for example by reading it in a text editor. Information is not encrypted or compressed.
  • Self-Documentation – the format allows you to embed metadata directly in the record. You don’t need a separate program or database to find out what the record is.
  • External Dependencies – how much specialized hardware or software do you need to access the format? The less you need, the better.
  • Impact of Patents – patents could make it harder to open or migrate formats. This is less of a worry with formats that are widely adopted.
  • Technical Protection Mechanisms – encryption, copy protection, and similar controls should not prevent you from copying or migrating the content. The format should be accessible regardless of the system to which it was originally uploaded.

The main content types are images, video, audio, and text. ISO-compliant formats for these types of materials include:

  • PDF/A
  • Plain Text
  • XML
  • TIFF
  • JPEG2000

If you hold several different file formats and multiple versions of those formats (for example, numerous versions of PDF, Word, and image formats), your digital preservation strategy should mitigate the effects of format obsolescence and proliferation. Strategies include file migration, emulation, normalization, and an institutional policy of using only certain file formats.
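A first step toward any of these strategies is knowing which formats you actually hold. The sketch below is one minimal way to take that inventory in Python. The preferred-extension list and the function names are illustrative assumptions, not an official policy, and a file extension is only a rough proxy for the real format (a .pdf file is not necessarily PDF/A, for instance).

```python
from collections import Counter
from pathlib import Path

# Hypothetical preferred-format policy, keyed by file extension.
# Replace with the list your institution has actually adopted.
PREFERRED_EXTENSIONS = {".pdf", ".txt", ".xml", ".tif", ".tiff", ".jp2"}


def count_extensions(paths):
    """Count files by lowercase extension; files with no extension count as '(none)'."""
    counts = Counter()
    for p in paths:
        counts[Path(p).suffix.lower() or "(none)"] += 1
    return counts


def migration_candidates(counts, preferred=PREFERRED_EXTENSIONS):
    """Return the extensions (and file counts) that fall outside the preferred list."""
    return {ext: n for ext, n in counts.items() if ext not in preferred}


def scan_directory(root):
    """Inventory every file under a folder, including subfolders."""
    return count_extensions(p for p in Path(root).rglob("*") if p.is_file())
```

Running scan_directory on a shared drive and passing the result to migration_candidates gives a short list of formats to review for migration or normalization. Extension counts are only a starting point; confirming what a file really is takes a dedicated format identification tool.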

Ransomware Can Hold Your Records “Hostage”

In another unfortunate trend for Ohio in 2024, Wood County experienced a ransomware attack that has prevented the county from accessing its electronic records management system. As seen in the article found here, while the attack is not impacting public services, the county is resorting to pen and paper to record emergency calls and cannot access historical police records.

Just as water or fire can damage paper records, disasters and business disruptions such as these cyberattacks can put your electronic records at risk. There are several things your office should keep in mind:

  1. Understand where your records are on your network as well as who has permission to access those files. ARMA International has a great article on defining data maps found here. This will also help identify where your vital records are: those records that are integral to your business operations and should be recovered quickly.
  2. Have your IT staff routinely back up your electronic records and apply updates to system software, antivirus tools, and network firewalls.
  3. Provide mandatory cybersecurity training to your office staff to educate them on identifying fraudulent requests and the steps to report them.
  4. Clean up electronic records that have met their applicable records retention schedules and are no longer needed. The fewer files there are on your network, the fewer files could potentially be stolen from your office.
  5. Finally, establish a continuity of operations plan (COOP) to define the policies and procedures for responding to an emergency or disaster. Having a COOP in place will allow a swifter restart of your operations. FEMA has a brief brochure describing a COOP found here.
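
The retention cleanup in step 4 can be partly automated. The sketch below is a rough illustration in Python, assuming each record is described by a name, a record series, and the date it was closed. The series names and retention periods are made up for the example; real values must come from your office's approved retention schedule. The code only flags records for review and deletes nothing.

```python
from datetime import date

# Hypothetical retention periods in years, keyed by record series.
RETENTION_YEARS = {"correspondence": 2, "payroll": 7}


def disposal_date(closed_on, years):
    """Return the date on which a record closed on `closed_on` has been kept for `years` years."""
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # Closed on Feb 29 and the target year is not a leap year: use Mar 1.
        return date(closed_on.year + years, 3, 1)


def eligible_for_disposal(records, today, schedule=RETENTION_YEARS):
    """Return the names of records whose retention period has elapsed as of `today`.

    `records` is an iterable of (name, series, closed_on) tuples. Records whose
    series is missing from the schedule are never returned.
    """
    eligible = []
    for name, series, closed_on in records:
        years = schedule.get(series)
        if years is not None and disposal_date(closed_on, years) <= today:
            eligible.append(name)
    return eligible
```

Records in a series the schedule does not cover are deliberately left out, so nothing gets flagged by default. A list like this is only a starting point: retention can also depend on factors the sketch ignores, such as legal holds or pending audits.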

Re-Evaluating a Public Record

A recent ruling by the Ohio Supreme Court concerning email distribution lists for a township newsletter helped define a “public record” for something as simple as a mailing list. Per the recent article from Court News Ohio found here, the dispute concerned a 2022 public records request for the township newsletter email distribution list. The township originally denied the request, but the requester argued that the list was a public record and contained no information exempted under ORC 149.43. While the township claimed the list did not document the activities of the office and was developed and maintained by a vendor, in Hicks v. Union Township, the Supreme Court agreed that the list was a public record per the three elements of a public record as defined in ORC 149.011(G). Remember, a public record is:

  • “Any document, device, or item, regardless of physical form or characteristic, including an electronic record as defined in section 1306.01 of the Revised Code”,
  • “Created or received by or coming under the jurisdiction of any public office of the state or its political subdivisions”,
  • And “serves to document the organization, functions, policies, decisions, procedures, operations, or other activities of the office”.

As always, when working with public records requests, it is important to have your office’s legal counsel review the request, the responsive records, and the response sent. The case can be found here: Hicks v. Union Twp., Clermont Cty. Bd. of Trustees.