User:David Nemeskey/CG XML brainstorming

From Apertium
Revision as of 17:39, 27 June 2013 by David Nemeskey

This page lists my (and others') ideas of what the CG XML format could or should look like.

Sets and lists

The words set and list are used interchangeably in CG. This is in contrast to how these terms are used in computer science, and partly to the commonsense meanings of the words as well. The current planning process might be just the right time to fix this issue. I propose to say good-bye to list.

The (XML) tags below will be used throughout the grammar to specify tags and sets, e.g. in constraint conditions.

Item               CG syntax   XML syntax
Regular tag        nom         <tag>nom</tag>
Sequence tag       (n pl)      <seq><tag>n</tag><tag>pl</tag></seq>
Reading base-form  "dog"       <lemma>dog</lemma>
Word-form          "<dogs>"    <word>dogs</word>
Set                (...)       <set>...</set>
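
To make the mapping above concrete, here is a minimal sketch that builds each item with Python's standard xml.etree.ElementTree. The element names come from the table; the helper functions (tag, seq, lemma, word, cg_set, to_xml) are hypothetical names made up for this illustration and are not part of any existing CG tool.

```python
import xml.etree.ElementTree as ET


def _leaf(element_name, text):
    """Create an element that only holds text, e.g. <tag>nom</tag>."""
    el = ET.Element(element_name)
    el.text = text
    return el


def tag(name):
    """Regular tag: nom -> <tag>nom</tag>"""
    return _leaf("tag", name)


def seq(*names):
    """Sequence tag: (n pl) -> <seq><tag>n</tag><tag>pl</tag></seq>"""
    el = ET.Element("seq")
    for name in names:
        el.append(tag(name))
    return el


def lemma(base_form):
    """Reading base-form: "dog" -> <lemma>dog</lemma>"""
    return _leaf("lemma", base_form)


def word(form):
    """Word-form: "<dogs>" -> <word>dogs</word>"""
    return _leaf("word", form)


def cg_set(*members):
    """Set: wraps already-built items in <set>...</set>."""
    el = ET.Element("set")
    for member in members:
        el.append(member)
    return el


def to_xml(element):
    """Serialize an element to a string without an XML declaration."""
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    print(to_xml(cg_set(tag("nom"), seq("n", "pl"), lemma("dog"), word("dogs"))))
```

Serializing a set next to a sequence this way also makes the first observation below easy to see: the two wrappers differ only in their element name.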

Observations:

  1. seq and set look very similar, which might be a problem when skimming through a CG.
  2. I don't know if we even need set: in the construction rules, you have to put sets everywhere, and those will have separate XML tags anyway.

Delimiters

Probably the easiest of the bunch:

<delimiters>(word forms, sets, etc.)</delimiters>
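
As a rough sketch of what a filled-in delimiters element might look like, the snippet below wraps a few word forms in <delimiters>. It assumes that word forms are written with the word element from the table above and that sentence-final punctuation is a typical choice of delimiters; the helper name delimiters is hypothetical.

```python
import xml.etree.ElementTree as ET


def delimiters(word_forms):
    """Build <delimiters> with one <word> child per delimiter word form.

    Assumption: delimiters are given as plain word forms; sets or other
    items would need their own child elements.
    """
    root = ET.Element("delimiters")
    for form in word_forms:
        child = ET.SubElement(root, "word")
        child.text = form
    return root


xml_text = ET.tostring(delimiters([".", "!", "?"]), encoding="unicode")
print(xml_text)
```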