July 20, 2005

ALA Report: MODS, MARC & Metadata Interoperability

I recently attended an excellent program on metadata issues at the ALA Annual Conference in Chicago. The program had several objectives: to discuss the side-by-side use of MARC with XML-based metadata schemes; to introduce MODS (Metadata Object Description Schema) as a workable, mature standard that is already being used in the real world; and to discuss strategies for metadata reuse.

June 27, 2005: ALA Annual Conference 2005
Sponsoring Organization: ALCTS/LITA/NIRMIG

Dr. William Moen (University of North Texas);
Rebecca Guenther (Library of Congress);
Ann Caldwell (Brown University);
Marty Kurth (Cornell University);
Terry Reese (Oregon State University)

Moen gave an introduction to the particulars of metadata and laid out some of the intellectual concepts that underlie the difficulties in crosswalking data between schemas. He stated that until recently, metadata had been thought of as largely a systems problem. Moen argued for developing a view of metadata from a user perspective, making sure that data standards meet search and retrieval requirements. He also said that he believes no single library metadata scheme will emerge, but that XML will become entrenched as the common syntax. Overall, libraries must think of themselves as “one node on the information network.” Long accustomed to the highest position in a hierarchy, libraries must accept that in the new networked world of information they sit in a large pool and must distinguish themselves within it.

Moen’s presentation: http://www.unt.edu/wmoen/presentations.htm
Marty Kurth manages metadata for the Cornell libraries, and he shared his observations on managing large metadata projects, specifically reusing MARC data for digital projects that focused on pulling items from the monograph collections. This was especially interesting in light of the Cornell digital projects model, which operates as a consulting service somewhat separate from the library itself.

Kurth stated that the storehouse of MARC records is the research library’s main asset. A huge amount of work has been put into populating and maintaining this database, so it should be reused whenever possible. However, maintaining “separate-but-equal” data stores in different formats (i.e., MARC and XML) is a headache.

Kurth’s staff previously used a model in which IT staff at the library wrote scripts to pull specific fields from the MARC records and output them into a database. These scripts were re-run at set intervals to refresh the database. Recently, they have moved to a model in which the metadata staff themselves use XSLT to do the transformations. The XSLT model is more efficient, since it lets the metadata staff see results quickly and tweak the stylesheets to improve them, but it brings documentation issues that did not exist with the more centralized scripting model. Kurth stressed the need to document and archive the tools, templates, scripts, etc. that are used to convert data. Along with the metadata itself, these provide an important record of the decisions and strategies on which later work is built.
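
To make the “pull specific fields” step concrete, here is a minimal Python sketch of that kind of extraction. It is not Cornell’s script: it assumes the records are available as MARCXML rather than binary MARC, and the field map, the sample record, and the function name are all invented for illustration.

```python
import xml.etree.ElementTree as ET

MARC_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": MARC_URI}

# Hypothetical field map: output column -> (MARC tag, subfield code).
FIELD_MAP = {
    "title": ("245", "a"),
    "creator": ("100", "a"),
    "date": ("260", "c"),
}

# Invented sample record, used only to exercise the function.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Doe, Jane.</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">An example title /</subfield>
      <subfield code="c">Jane Doe.</subfield>
    </datafield>
    <datafield tag="260" ind1=" " ind2=" ">
      <subfield code="c">1905.</subfield>
    </datafield>
  </record>
</collection>"""


def extract_rows(marcxml_text, field_map=FIELD_MAP):
    """Return one dict per MARCXML record, keyed by the columns in field_map."""
    root = ET.fromstring(marcxml_text)
    rows = []
    for record in root.iter(f"{{{MARC_URI}}}record"):
        row = {}
        for column, (tag, code) in field_map.items():
            # First matching subfield only; repeated fields are ignored here.
            path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
            node = record.find(path, MARC_NS)
            has_text = node is not None and node.text
            row[column] = node.text.strip() if has_text else None
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE):
        print(row)
```

Keeping the field map as data rather than burying it in the code is one small way to get the documentation Kurth asked for: the map itself records which fields were chosen.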

Kurth’s presentation: http://dspace.library.cornell.edu/handle/1813/1457
Guenther’s presentation was an introduction to the structure and use of MODS, paying particular attention to its ability to maintain hierarchical levels of description within a single record. As always, this capacity does not mean that an institution will have the resources to take full advantage of it.
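
As a rough illustration of hierarchical description, the Python sketch below builds a MODS record in which a parent item carries descriptions of its parts in relatedItem elements of type “constituent”. The helper names and sample titles are made up for this post and are not taken from Guenther’s slides.

```python
import xml.etree.ElementTree as ET

MODS_URI = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_URI)


def q(name):
    """Qualify a local element name with the MODS namespace."""
    return f"{{{MODS_URI}}}{name}"


def add_title(parent, title):
    """Append a titleInfo/title pair to the given element."""
    info = ET.SubElement(parent, q("titleInfo"))
    ET.SubElement(info, q("title")).text = title


def build_record(title, constituent_titles):
    """Build one MODS record whose parts are nested as constituents."""
    mods = ET.Element(q("mods"))
    add_title(mods, title)
    for part_title in constituent_titles:
        # Each part is described inside the parent record, one level down.
        part = ET.SubElement(mods, q("relatedItem"), {"type": "constituent"})
        add_title(part, part_title)
    return mods


if __name__ == "__main__":
    record = build_record("Photograph album", ["Photograph 1", "Photograph 2"])
    print(ET.tostring(record, encoding="unicode"))
```

A relatedItem can itself contain further relatedItem elements, so the same pattern extends to deeper levels when an institution has the resources to describe them.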

Guenther’s presentation: http://www.loc.gov/standards/mods/presentations/ala2005-mods.htm
Caldwell runs a small metadata “shop” at Brown University that creates metadata for digital projects. They do so entirely in MODS records, most of them fairly minimal in detail, using homegrown templates in an application called NoteTab Pro. This program is a text editor that allows custom templates and error-checking routines. Their image database currently runs on a custom platform, but they are looking at various proprietary solutions.

Caldwell’s presentation: http://dl.lib.brown.edu/staff/caldwell/MODSatBrown.ppt
Reese concluded the program by demonstrating a new version of his MarcEdit software. Reese has enhanced the program extensively over the past several months, and he demonstrated its new features while discussing some of the reasons for his design decisions. Most important in the new version are the enhanced conversion tools. These tools convert records into MARCXML and then into the desired format. Reese discussed his strategy of using MARCXML as the matrix for conversion: it is as rich as MARC, so conversion back and forth is lossless, and because it is an XML format, it is easier to convert into other XML-based formats. Reese also demonstrated a tool that displays a MODS record as if it were a MARC record, giving catalogers who are used to the latter an easy entry into the MODS world.
Reese’s presentation: http://oregonstate.edu/~reeset/presentations/ala/summer2005/ala_2005_mods.ppt
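
The idea of using MARCXML as the pivot can be sketched in a few lines of Python. This is not how MarcEdit is implemented (it is only a toy illustration): every target format gets its own small converter that reads the same MARCXML record. The two-field mappings, the plain “record” wrapper around the Dublin Core elements, the sample record, and all function names are simplifying assumptions.

```python
import xml.etree.ElementTree as ET

MARC = "http://www.loc.gov/MARC21/slim"
MODS = "http://www.loc.gov/mods/v3"
DC = "http://purl.org/dc/elements/1.1/"

# Invented sample record, used only to exercise the converters.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None."""
    path = f"m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    node = record.find(path, {"m": MARC})
    return node.text.strip() if node is not None and node.text else None


def to_mods(record):
    """Tiny MARCXML -> MODS mapping (title and main author only)."""
    mods = ET.Element(f"{{{MODS}}}mods")
    title = subfield(record, "245", "a")
    if title:
        info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
        ET.SubElement(info, f"{{{MODS}}}title").text = title
    author = subfield(record, "100", "a")
    if author:
        name = ET.SubElement(mods, f"{{{MODS}}}name")
        ET.SubElement(name, f"{{{MODS}}}namePart").text = author
    return mods


def to_dc(record):
    """Tiny MARCXML -> Dublin Core mapping (title and creator only)."""
    dc = ET.Element("record")  # hypothetical wrapper element
    mapping = {"title": ("245", "a"), "creator": ("100", "a")}
    for element, (tag, code) in mapping.items():
        value = subfield(record, tag, code)
        if value:
            ET.SubElement(dc, f"{{{DC}}}{element}").text = value
    return dc


# Adding a new output format means adding one entry here.
CONVERTERS = {"mods": to_mods, "dc": to_dc}


def convert(marcxml_text, target):
    """Parse one MARCXML record and hand it to the chosen converter."""
    return CONVERTERS[target](ET.fromstring(marcxml_text))


if __name__ == "__main__":
    for target in CONVERTERS:
        print(ET.tostring(convert(SAMPLE, target), encoding="unicode"))
```

The appeal of the pivot design is that each new format needs only one converter to or from MARCXML, rather than a separate converter for every pair of formats.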
