Guidelines
- Archiphonemes should be a single character.
- Archiphonemes in lexc should be encased in { and }.
- Archiphonemes should be declared in the Multichar_Symbols section in the header of the file after the grammatical tags, with a comment giving their possible forms.
- If the archiphoneme is subject to deletion, it should be written in lower case, e.g. {s}.
- If the archiphoneme has a range of default surface forms (even if rarely subject to deletion), it should be written in upper case, e.g. {A}.
- If the archiphoneme is always deleted, it may consist of more than one character, e.g. {dup}. This is, however, advised against.
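The naming conventions above can be encoded as a small check. The following is only a sketch: the function name classify_archiphoneme and the returned labels are hypothetical and are not part of lexc or any Apertium tool. It assumes the symbol is passed in its plain form, e.g. {A}.

```python
import re

# Hypothetical helper for illustration; not part of lexc or any Apertium tool.
_ARCHIPHONEME = re.compile(r"\{([^{}]+)\}")


def classify_archiphoneme(symbol: str) -> str:
    """Classify a symbol such as "{A}" according to the guidelines above."""
    match = _ARCHIPHONEME.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not encased in braces: {symbol!r}")
    name = match.group(1)
    if len(name) > 1:
        # Permitted only for always-deleted symbols, and advised against.
        return "multi-character (discouraged)"
    if name.islower():
        return "subject to deletion"
    if name.isupper():
        return "range of surface forms"
    # Digits or caseless scripts are not covered by the guidelines.
    return "unclassified"
```

For the three examples in the list, this gives "subject to deletion" for {s}, "range of surface forms" for {A}, and "multi-character (discouraged)" for {dup}.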
Common archiphonemes
Frequently asked questions
- Why use {C} and not ^C?
<spectie> was thinking about {A} over ^A
<Flammie> good
<spectie> and worked out a nice argument for it (aside from pure aesthetics):
<spectie> other programs (e.g. morphological segmenters) parsing the output with {A} don't need to know about multicharacter symbols
<spectie> compare:
<spectie> foo{A}z{A}l
<spectie> foo^Az^Al
<spectie> with the first you know where the symbol ends
<spectie> in the second you do not know
<spectie> it may be ^Az and ^Al or ^A z ^A l
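The argument in the log can be illustrated with a short sketch (Python 3.9 or later). The function names split_braced and split_caret are hypothetical, and the symbol inventories passed to split_caret are made up for the example. The braced form can be split using the delimiters alone, while the caret form gives different results depending on which multicharacter symbols the parser is told about.

```python
import re


def split_braced(text: str) -> list[str]:
    """Tokenise using only the brace delimiters; no symbol inventory needed."""
    return re.findall(r"\{[^{}]*\}|.", text)


def split_caret(text: str, symbols: list[str]) -> list[str]:
    """Tokenise caret-prefixed symbols by longest match against an inventory."""
    tokens, i = [], 0
    ordered = sorted(symbols, key=len, reverse=True)
    while i < len(text):
        for sym in ordered:
            if text.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            # No known symbol starts here, so emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(split_braced("foo{A}z{A}l"))
# ['f', 'o', 'o', '{A}', 'z', '{A}', 'l']
print(split_caret("foo^Az^Al", ["^A"]))
# ['f', 'o', 'o', '^A', 'z', '^A', 'l']
print(split_caret("foo^Az^Al", ["^Az", "^Al"]))
# ['f', 'o', 'o', '^Az', '^Al']
```

The last two calls parse the same string differently, which is the ambiguity described above: without the list of multicharacter symbols, a consumer of the output cannot tell where ^A ends.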