Development ideas for dictionary format

This page collects ideas for extending the Apertium .dix format so that it could serve as a drop-in replacement for lexc. The .dix format already has several advantages over lexc: convenient validation, a more restrictive syntax, support for multiword queues, and built-in support for analysis/generation restrictions. The problem is that it either lacks some useful features that lexc has, or supports them only awkwardly. It would also be desirable to standardise some of the typical lexc conventions, e.g. to have one way of writing the morpheme boundary rather than a hundred.

Archiphonemes

Perhaps use entities?

Just using <s> for each archiphoneme is pretty much out:

<e><p><l><s n="pron"/></l><r><s n="L"/><s n="A"/><s n="G"/><s n="I"/></r></p><par n="CASE"/></e>

This would be the equivalent of the lexc entry:

%<pron%>:%>%{L%}%{I%}%{K%}%{I%} CASE ;

Something like the following, using entities:

<e><p><l><s n="pron"/></l><r>&L;&A;&G;&I;</r></p><par n="CASE"/></e>

This might be liveable. Would the compiler then convert these entities into {L}{A}{G}{I} tags?
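
The entity idea can be tried out with a stock XML parser. The following is a minimal sketch, assuming the archiphonemes are declared as internal entities in the DOCTYPE and expand to {X}-style text; the helper names `doctype` and `right_side` are hypothetical, and nothing here describes how the existing compiler behaves:

```python
import xml.etree.ElementTree as ET

# Assumption: the set of archiphonemes is known up front.
ARCHIPHONEMES = ["L", "A", "G", "I"]

# The entity-based entry proposed above.
ENTRY = '<e><p><l><s n="pron"/></l><r>&L;&A;&G;&I;</r></p><par n="CASE"/></e>'


def doctype(names):
    """Build an internal DTD subset declaring one entity per archiphoneme,
    e.g. <!ENTITY L "{L}">."""
    decls = "".join(f'<!ENTITY {n} "{{{n}}}">' for n in names)
    return f"<!DOCTYPE dictionary [{decls}]>"


def right_side(entry_xml, names):
    """Parse the entry and return the text of <r> after entity expansion."""
    doc = f"{doctype(names)}<dictionary>{entry_xml}</dictionary>"
    root = ET.fromstring(doc)
    return root.find("./e/p/r").text


print(right_side(ENTRY, ARCHIPHONEMES))
```

With this approach the XML parser itself expands the entities, so the compiler would only ever see the {L}{A}{G}{I} text on the right side and would still need to treat each {X} as a single symbol.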

Morpheme boundary

Current tags:

<a> = "alarm"
<s> = "symbol"
<b> = "blank"
<j> = "join"
<g> = "group"

It's desirable that the new morpheme boundary tag also be a single letter.

Available: c d f h k m n o q t u v w x y z
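
The list of available letters can be double-checked mechanically. A small sketch, assuming the single-letter element names already in use are the five listed above plus e, l, p and r (which appear in the examples on this page) and i:

```python
import string

# Assumption: single-letter .dix element names already taken.
# a, s, b, j, g are listed above; e, l, p, r occur in the examples; i is assumed.
USED = set("asbjg") | set("elpr") | {"i"}

# Every lowercase ASCII letter not yet used as an element name.
available = [c for c in string.ascii_lowercase if c not in USED]

print(" ".join(available))
```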
