Digital preservation in a webinar, part two

Last week (one week after the first webinar), I attended the second Introduction to Digital Preservation webinar, which is hosted by ASERL (Association of Southeastern Research Libraries).

[For more details and to register for the remaining two webinars, see the project webpage]

This webinar was titled: “Forbearing the Digital Dark Age: Capturing Metadata for Digital Objects.” Chris Dietrich, from the National Park Service, presented, with help from Jodi DeRidder and John Berger.

My first impression was that the polls embedded in the presentation slides were pretty neat and convenient, and some day I’m going to learn how to do that.

To get back to the main topic, digital preservation, here are my highlights from the session.

Chris Dietrich described what he meant by the “digital dark age”: the “meantime” in which our understanding and preservation strategies are still catching up to our technology. Until they do, we will remain unable to fully capture our content.

Metadata

  • Is like card catalog information
  • Can be managed, shared, preserved
  • Is important because digital objects will be with us a long time, and metadata adds value to objects
  • With no idea how our digital objects will be used in the future, the more information we provide about these objects, the more valuable they will be

Categories of metadata

  • Descriptive – discovery, understanding (i.e. the card catalog information)
  • Administrative – management (usually in repositories), e.g. access restrictions
  • Structural – storage, presentation, logical/physical components (e.g. HasPart, IsPartOf, IsRelatedTo in Dublin Core)
  • Technical – properties of the file itself, instrument settings, e.g. make and model of camera for photographs, when the object was created
  • Other – rights, preservation, geospatial

NB: Chris Dietrich recommended not worrying about whether every piece of metadata fits into its appropriate category – these categories are just guidelines
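To make the categories concrete for myself, here is a rough sketch (mine, not from the webinar) of one object's metadata grouped by category. The class name, the field names, and the sample values are all hypothetical. The `flatten` method reflects the point that the categories are only guidelines: the same information can be stored without them.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Metadata for one digital object, grouped by category (hypothetical layout)."""
    descriptive: dict = field(default_factory=dict)     # discovery, understanding
    administrative: dict = field(default_factory=dict)  # management, access restrictions
    structural: dict = field(default_factory=dict)      # relationships between components
    technical: dict = field(default_factory=dict)       # file properties, instrument settings
    other: dict = field(default_factory=dict)           # rights, preservation, geospatial

    def flatten(self) -> dict:
        """Merge every category into one dict, prefixing each key with its category."""
        flat = {}
        for category, values in vars(self).items():
            for key, value in values.items():
                flat[f"{category}.{key}"] = value
        return flat


# Sample values are made up for illustration.
record = MetadataRecord(
    descriptive={"title": "Trail at sunrise"},
    technical={"camera_model": "Example Cam 1"},
)
```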

How metadata is captured

  1. Embedded
  2. External

Embedded metadata

  • Advantage: it travels with the file, and it can be removed from the file when needed
  • Not all metadata may be appropriate to keep with the file, or to show to certain audiences
  • Location can be sensitive data, so it is important to pull it out before sharing (redaction can be built into the logic of the management system, or it can be done manually)
  • Make sure what is in the file is synced with the information in the repository!
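The last two points, redacting location and keeping the file in sync with the repository, can be sketched in a few lines. This is my own illustration, assuming the embedded metadata has already been read into a plain dictionary. The `GPS` prefix matches the EXIF GPS tag names (such as `GPSLatitude`), but which fields count as sensitive is an assumption you would adjust, and both function names are hypothetical.

```python
# Assumption: any tag whose name starts with one of these prefixes is sensitive.
SENSITIVE_PREFIXES = ("GPS",)


def redact_for_sharing(metadata: dict, sensitive_prefixes=SENSITIVE_PREFIXES) -> dict:
    """Return a copy of the metadata without location fields; the input is unchanged."""
    return {
        key: value
        for key, value in metadata.items()
        if not key.startswith(sensitive_prefixes)
    }


def find_mismatches(embedded: dict, repository: dict) -> dict:
    """Map each field that differs to its (embedded value, repository value) pair."""
    keys = embedded.keys() | repository.keys()
    return {
        key: (embedded.get(key), repository.get(key))
        for key in sorted(keys)
        if embedded.get(key) != repository.get(key)
    }
```

Because `redact_for_sharing` returns a copy, the master record keeps its coordinates and only the shared version loses them.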

External metadata

  • Is usually in companion or “side car” documents, in spreadsheets, XML
  • Is not part of the file or object itself
  • Advantage: is easily edited in bulk
  • Disadvantage: is easily orphaned or misplaced, separated from the original file/object
  • Needs to be synced with the repository and original
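Here is a minimal sketch of the “side car” idea, again my own rather than anything shown in the webinar. It assumes a naming convention in which the companion file sits next to the original and adds `.metadata.json` to its name. That suffix and the function names are hypothetical. The second function looks for the disadvantage mentioned above: side cars whose original has gone missing.

```python
import json
from pathlib import Path

SIDECAR_SUFFIX = ".metadata.json"  # hypothetical naming convention


def sidecar_path(original: Path) -> Path:
    """Path of the companion file for an original, e.g. photo.jpg.metadata.json."""
    return original.with_name(original.name + SIDECAR_SUFFIX)


def write_sidecar(original: Path, metadata: dict) -> Path:
    """Write the metadata as JSON next to the original and return the new path."""
    path = sidecar_path(original)
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return path


def find_orphans(folder: Path) -> list:
    """List side car files in the folder whose original file no longer exists."""
    orphans = []
    for sidecar in sorted(folder.glob("*" + SIDECAR_SUFFIX)):
        original = sidecar.with_name(sidecar.name[: -len(SIDECAR_SUFFIX)])
        if not original.exists():
            orphans.append(sidecar)
    return orphans
```

Running `find_orphans` over a folder after moving or renaming files is one cheap way to notice that metadata and objects have drifted apart.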

Photographic metadata

Types of photographic files

  • JPEG – lower resolution copy
  • TIFF – higher resolution copy
  • RAW – digital negative (most formats are proprietary; these files are often kept in archives and are used for analysis)

Types of photographic metadata

  • EXIF – technical, descriptive
  • XMP (Extensible Metadata Platform) – created by Adobe (descriptive); can be embedded in JPEGs; is starting to see wider adoption, and is used in addition to EXIF
  • IPTC – originated with photojournalism; now subsumed by XMP

Tools to get at embedded metadata

1. Windows Explorer (ubiquitous)

  • Performs batch operations (edit titles, keywords, projects)
  • Allows metadata input and editing and file renaming (although this function is clumsy)
  • Discovery: basic searching
  • Limited functionality

2. Proprietary tools: GPS photos

  • Batch operations
  • Watermarking
  • Multiple outputs
  • “Geospatial-centric”
  • Useful for photography with GPS and embedded coordinates

3. Open source/free tools: IExifPro

  • View/edit all EXIF metadata

4. Open source/free tools: Windows Live Photo Gallery

  • “Prep and publish” software
  • Is a free download

5. Source for open source/free/shareware: SourceForge

Photographic metadata standards

  • Dublin Core
  • Federal Geographic Data Committee (geospatial metadata)

Required elements (at NPS)

  1. Title – who, what, where, when
  2. Create date – born or digitized
  3. Contact info – photographer/steward
  4. Access constraints – copyright, privacy
  5. Constraints info – describe the constraints for access, etc.
  6. Place description – place name
  7. NPS Unit Info – local
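A list of required elements like this lends itself to an automatic check. The sketch below is my own and not an NPS tool. The field names are hypothetical keys I made up to mirror the seven elements above, and a record is assumed to be a plain dictionary.

```python
# Hypothetical keys mirroring the seven required elements listed above.
REQUIRED_FIELDS = [
    "title",
    "create_date",
    "contact_info",
    "access_constraints",
    "constraints_info",
    "place_description",
    "nps_unit_info",
]


def missing_required(record: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or blank, in list order."""
    return [name for name in required if not str(record.get(name) or "").strip()]
```

An empty result means the record has something in every required field. It says nothing about whether the values are accurate.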

NB: Save and archive things (master or original copies) at the highest resolution that is manageable, because increasing the quality/resolution of a lower-resolution photograph will not work.

Document metadata

Document types

  • Word
  • PDF – an uneditable snapshot, which can contain other types of document, and may not be a text document at all (e.g. a PDF sometimes holds photos or videos)

Dublin Core standards

  • Flat, flexible, and easy to use
  • For any object type
  • Can be imprecise
  • Simple and qualified implementations, which can be extended to add more specificity
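To show how flat a simple Dublin Core record is, here is a short sketch using Python's standard library. The namespace URI is the one for the Dublin Core element set, and `title`, `creator`, and `subject` are real Dublin Core elements. The wrapping `metadata` element and the function name are my own assumptions, since Dublin Core itself does not prescribe a container.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize elements with the usual dc: prefix


def to_dublin_core(fields: dict) -> str:
    """Build a flat XML record; a value may be one string or a list of strings."""
    root = ET.Element("metadata")  # assumed wrapper element
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Every element can repeat, which is part of what makes the format flexible.
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")
```

For example, `to_dublin_core({"title": "Trail at sunrise", "subject": ["trails", "sunrise"]})` produces one `dc:title` element and two `dc:subject` elements, with no nesting between them.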

Library of Congress TextMD

  • Technical metadata for text objects

Document metadata tools

  • Windows Explorer
  • MS Word – for individual file metadata editing
  • Adobe Acrobat – individual file metadata editing; inherits metadata from Word

Suggested workflows

  1. Inventory
  2. Prioritize
  3. Categorize
  4. Describe
  5. Back up
  6. Archive
  7. Share
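The webinar did not go into tooling for these steps, but the first one, the inventory, is easy to start on your own. This is a sketch under my own assumptions: an inventory row holds a relative path, a size in bytes, and a SHA-256 checksum, and the function name is hypothetical.

```python
import hashlib
from pathlib import Path


def inventory(folder: Path) -> list:
    """List every file under the folder with its relative path, size, and checksum."""
    rows = []
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch, not for huge files.
            data = path.read_bytes()
            rows.append({
                "path": path.relative_to(folder).as_posix(),
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    return rows
```

Keeping the checksums also helps with the later steps, because a backed-up or archived copy can be compared against the inventory to confirm it has not changed.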
