Workshop on
The Digitization of Language Data:
The Need for Standards

Working Group on Metadata

New: Working Group Responses

Request for the Metadata Working Group:

The LINGUIST List has secured NSF funding to become a central metadata server for the discipline--the "union catalog" for the Open Language Archives Community (OLAC). (The ARC service [arc.cs.odu.edu] is an example of such a cross-archive search service.) What we are soliciting from you, the members of the Metadata Working Group, is feedback on the usefulness of the metadata set that our software will be designed to handle. Essentially, we would like you to try it out and write a brief report (1 page or less) that we can use as a springboard for discussion at the workshop.

Details of our request follow, along with more information on what metadata is and why we are interested in it.

As a member of OLAC, we are committed to using and helping to develop the simplified OLAC metadata set. We also hope to be a portal to, and perhaps a collector of, metadata in the more comprehensive ISLE format.

About the proposed sets:

  • OLAC Metadata Set (See, in particular, the Elements section): The OLAC metadata set has 15 elements, or categories of information. They were adapted from the Dublin Core Metadata Element Set, a set developed in Dublin, Ohio (yes) and intended to describe a broad range of resources. The OLAC Metadata Set qualifies and refines the DC elements to make them better descriptors of language resources.
  • ISLE Metadata Set (pdf) (International Standards for Language Engineering, Metadata Working Group, Max Planck Institute, Nijmegen): The goal of the EAGLES/ISLE Meta Data Initiative is to propose a standard for metadata descriptions of multimedia/multimodal language resources. It is a joint US/EU project, and its US version is equivalent to the OLAC set. However, the European version of the ISLE Metadata Set is much longer than the OLAC set and is intended to describe a "session" (e.g., an interview), which may give rise to multiple resources, e.g., videotapes, audiotapes, and transcripts.
  • DOBES Metadata Set (pdf)
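
To make the idea of an element set concrete, the sketch below describes a single resource using the 15 element names of the Dublin Core Metadata Element Set, from which the OLAC set was adapted. The resource values, the `describe` helper, and the plain `<record>` wrapper are assumptions made for illustration. The OLAC qualifiers and refinements are not modeled, and this is not the OLAC, ISLE, or DOBES exchange format.

```python
import xml.etree.ElementTree as ET

# The 15 element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def describe(**fields):
    """Build a <record> element from keyword arguments named after DC elements.

    Unknown element names raise ValueError. Elements may be omitted, and a
    list value repeats the element once per item.
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    record = ET.Element("record")  # hypothetical wrapper element
    for name in DC_ELEMENTS:  # emit elements in a stable order
        values = fields.get(name, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            ET.SubElement(record, name).text = value
    return record


# Hypothetical resource, invented for illustration only.
record = describe(
    title="Recorded interview with transcript",
    creator="A. Fieldworker",
    type=["Sound", "Text"],
    language="en",
    date="2001-06-14",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Leaving every element optional and repeatable keeps the sketch close to how Dublin Core is normally used. Whether a given set should require particular elements is exactly the kind of question the reports can address.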

Although a great deal of work has gone into these proposals, there is no substitute for using the metadata to describe real resources. So we would be most grateful for your participation in a practical experiment now, while our facilities are in the design stages:

  • Try it out: Please take some of your data or language documentation and simply try to describe it using one of these sets. If you have no suitable resource to describe, just look at one or more of the sets and draw some conclusions. We are interested in the answers to questions like:
    • Are the categories (elements) clearly named and described?
    • Do they allow you to enter the right information--i.e., the information that would help other linguists decide whether or not to retrieve your resource? (Remember that metadata is intended to facilitate search and retrieval, not necessarily to be a complete description of the resource.)
    • Are these the categories you would find useful in retrieving other data or documentation?
    • The OLAC document contains information on work yet to do. Is there anything you would like to add or offer in response to these plans?
  • Write a brief (1 page) report of your results: If you email your report to Helen Aristar-Dry (hdry@linguistlist.org) by June 14, we will put it on the website prior to the workshop. Otherwise, we ask you to bring 12 copies of your report to the workshop. Your conclusions and suggestions will be the springboard for the discussion in the Metadata Working Group sessions.
A point to remember as you work: metadata isn't meant to be a theory or ontology of the linguistic world. It's just a set of convenient categories for finding useful information. These categories are intended to be easy for a non-specialist to apply to a newly created resource, and they should not need to be updated very frequently. The goal of metadata is usefulness, not perfection.
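
Because metadata exists to support search and retrieval rather than complete description, one practical check when trying out a set is whether the fields you filled in would let another linguist find the resource. The following is a minimal sketch of that check, assuming records are plain dictionaries keyed by element name. The catalog entries, the `find` helper, and the matching rule (case-insensitive substring) are all hypothetical and do not describe how the LINGUIST List or ARC services actually search.

```python
def find(records, **criteria):
    """Return the records whose named elements contain every requested substring."""
    def matches(record):
        for element, wanted in criteria.items():
            value = record.get(element, "")
            # An element may hold one string or a list of strings.
            values = [value] if isinstance(value, str) else value
            if not any(wanted.lower() in v.lower() for v in values):
                return False
        return True

    return [r for r in records if matches(r)]


# Hypothetical catalog entries, invented for illustration only.
catalog = [
    {"title": "Interview recordings", "language": "Quechua", "type": ["Sound"]},
    {"title": "Interview transcripts", "language": "Quechua", "type": ["Text"]},
    {"title": "Word list", "language": "Navajo", "type": ["Text"]},
]

hits = find(catalog, language="quechua", type="text")
print([r["title"] for r in hits])
```

If a resource you described cannot be picked out by the elements you would naturally search on, that is worth noting in your report.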
