Some resources on markup of lexicons
Compiled in preparation for upcoming EMELD workshop
Gary Simons, 13 July 2002
Existing markup proposals:
- R.A.Amsler and F.W.Tompa, 1988.
An SGML-based
Standard for English Monolingual Dictionaries, Information in Text:
Proc. 4th Conf. of Univ. of Waterloo Centre for the New OED (October 26-28,
1988), pp. 61-80. The appendix
describes the proposed tags.
- TEI Guidelines, 1994. Chapter 12:
Print Dictionaries.
- John Bell and Steven Bird, 2000.
A Preliminary Study of the Structure of Lexicon Entries. Proposed DTD.
- Erjavec, T., Evans, R., Ide, N., Kilgarriff, A., 2000.
The CONCEDE
Model for Lexical Databases. Proceedings of the Second Language Resources
and Evaluation Conference (LREC), Athens, Greece, 355-62. Like TEI at the
leaf level, but takes a more abstract approach to structure. An XSLT
implementation that interprets the structure and the inheritance of attributes
is given in: Ide, N., Kilgarriff, A., Romary, L. (2000).
A Formal
Model of Dictionary Structure and Content. Proceedings of Euralex 2000,
Stuttgart, 113-126.
Lists of elements that need to be accounted for in markup:
- List of ~100 markup
fields and ~50 lexical functions from Coward, David F. and Charles E.
Grimes. 1994. Making dictionaries: A guide to lexicography and the
Multi-Dictionary Formatter. Waxhaw, North Carolina: Summer Institute of
Linguistics.
- Gibbon, D., Peters, W., Wittenburg, P. (December 2001),
Metadata Elements for Lexicon Descriptions, Version 1.0, MPI Nijmegen.
- Conceptual model of lexicon developed by SIL International for LinguaLinks
and FieldWorks. Start with LexEntry; see the attributes listed on this object
and follow links to explore the attributes of related objects. See also the
UML diagram for lexical database classes.
- Grimes, Joseph E. 1988. Information dependencies in lexical subentries. In
Martha W. Evens (ed.), Relational Models of the Lexicon: Representing
knowledge in semantic networks. Cambridge: Cambridge University Press. pp.
167-181.
Requirements and markup philosophy:
- Nancy Ide and others, 1992.
Principles for
encoding machine readable dictionaries, EURALEX'92 Proceedings.
Describes the rationale followed in developing the TEI markup for print
dictionaries.
- William Lewis, Scott Farrar, D. Terence Langendoen, 2001.
Building
a Knowledge Base of Morphosyntactic Terminology, Proceedings of the IRCS
Workshop on Linguistic Databases. Articulates the philosophy of markup that has
been set for EMELD: use mapping to a common linguistic ontology in order to be
able to support the preferred markup schemes of contributing linguists.
- Nancy Ide, Laurent Romary, 2001.
Standards for Language Resources, Proceedings of the IRCS Workshop on Linguistic
Databases. Advocates mapping to an abstract markup with standardized semantics.
- Wittenburg, P., Peters, W. and Drude, S.,
Analysis of Lexical Structures
from Field Linguistics and Language Engineering. In: Proceedings of
LREC2002, Las Palmas, 2002. Concludes by advocating an "Abstract Lexicon
Model" and lists basic requirements for such a model.
- Dafydd Gibbon, 2001.
On
lexical objects and their properties: A contribution to the 'MetaLex'
requirements specification for spoken language lexicon documentation. The
distinction between macrostructure and microstructure is particularly useful.
A taxonomy for classifying types of lexical resources: