Unification of metadix and parametrized dictionaries

From Apertium
Revision as of 14:42, 30 October 2007 by Mlforcada

Different language-pair packages use different strategies to generate .dix dictionaries (monodix and bidix) from XML files that use features not supported by the .dix format. The objectives of these new dix-like formats are:

  • being able to use parametrized paradigms (so that a general paradigm can be defined once and reused with small parametrized variations), as discussed in the metadix page
  • being able to generate different versions of a translator (for instance, for two varieties of a language, such as Brazilian and European Portuguese), whose names could ideally be tied to mode names

There is currently a debate on unifying these formats into a single metadix format, which in turn could also support other desirable features, such as:

  • having metadata (headers) in dictionaries that state whether the dictionary is bilingual or monolingual, and which language pairs and modes it supports (perhaps this could be added to the basic .dix format)

Here is a proposal (open to discussion) on the first two issues:

  • endowing the e element with a vnt (variant) attribute, so that the corresponding metadix entry will go to the generated .dix only if that variant is selected (entries without a vnt will go to the .dix unconditionally).
  • having a way to mark a block of entries with ...
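
To make the vnt proposal concrete, here is a minimal sketch of how a generator might filter metadix entries for one variant. It is only an illustration under assumptions: the variant names pt_PT and pt_BR, the simplified entry contents, and the helper names select_variant and lemmas are all hypothetical and not part of any existing Apertium tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified metadix fragment. The variant names
# "pt_PT" and "pt_BR" are assumptions; the proposal does not fix a naming scheme.
METADIX = """\
<dictionary>
  <section id="main" type="standard">
    <e lm="comboio" vnt="pt_PT"><i>comboio</i></e>
    <e lm="trem" vnt="pt_BR"><i>trem</i></e>
    <e lm="casa"><i>casa</i></e>
  </section>
</dictionary>
"""


def select_variant(metadix_xml: str, variant: str) -> ET.Element:
    """Return a tree that keeps only the entries valid for `variant`.

    Entries without a vnt attribute are kept unconditionally; entries whose
    vnt matches are kept with the attribute stripped (plain .dix does not
    define it); all other entries are dropped.
    """
    root = ET.fromstring(metadix_xml)
    # Snapshot the element list first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for entry in list(parent.findall("e")):
            vnt = entry.get("vnt")
            if vnt is None:
                continue  # unconditional entry
            if vnt == variant:
                del entry.attrib["vnt"]  # selected: keep, drop the marker
            else:
                parent.remove(entry)  # belongs to another variant
    return root


def lemmas(root: ET.Element) -> list:
    """Collect the lm attribute of every remaining entry, in document order."""
    return [entry.get("lm") for entry in root.iter("e")]


if __name__ == "__main__":
    brazilian = select_variant(METADIX, "pt_BR")
    print(lemmas(brazilian))
    print(ET.tostring(brazilian, encoding="unicode"))
```

Stripping vnt from the selected entries keeps the output within the plain .dix vocabulary; whether a real generator should do this is a design choice that the proposal leaves open.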