Difference between revisions of "Archiphonemes"

Latest revision as of 16:22, 26 September 2016

Guidelines

* An archiphoneme's name (the part between the braces) should be a single character.
* Archiphonemes in lexc should be enclosed in braces, { and }.
* Archiphonemes should be declared in the Multichar_Symbols section in the header of the file, after the grammatical tags, with a comment giving their possible surface forms.
* If the archiphoneme is subject to deletion, it should be written in lower case, e.g. {s}.
* If the archiphoneme has a range of default surface forms (even if rarely subject to deletion), it should be written in upper case, e.g. {A}.
* If the archiphoneme is always deleted, it may consist of more than one character, e.g. {dup}. This is, however, discouraged.
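
The naming conventions above can be checked mechanically. The following is only a rough sketch in Python, not part of lexc or HFST: the names `ARCHIPHONEME` and `classify` are hypothetical, and the category strings simply paraphrase the guidelines.

```python
import re

# Hypothetical helper, not part of lexc or HFST.
# Matches a symbol that is wrapped in exactly one pair of braces.
ARCHIPHONEME = re.compile(r"^\{([^{}\s]+)\}$")


def classify(symbol: str) -> str:
    """Describe a symbol according to the naming guidelines above."""
    match = ARCHIPHONEME.match(symbol)
    if match is None:
        return "not brace-delimited"
    name = match.group(1)
    if len(name) > 1:
        # Allowed only for symbols that are always deleted, and discouraged.
        return "multi-character (discouraged)"
    if name.isupper():
        return "upper case: range of default surface forms"
    if name.islower():
        return "lower case: subject to deletion"
    return "single character without case"


for symbol in ["{A}", "{s}", "{dup}", "^A"]:
    print(symbol, "->", classify(symbol))
```

A check like this could be run over the symbols listed in Multichar_Symbols, but it cannot tell whether a symbol is actually deleted by the phonological rules; it only inspects the spelling.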

Common archiphonemes

Frequently asked questions

Why use {C} and not ^C ?

<spectie> was thinking about {A} over ^A
<Flammie> good
<spectie> and worked out a nice argument for it (aside from pure aesthetics): 
<spectie> other programs (e.g. morphological segmenters) parsing the output with {A} don't need to know about multicharacter symbols 
<spectie> compare: 
<spectie> foo{A}z{A}l
<spectie> foo^Az^Al
<spectie> with the first you know where the symbol ends
<spectie> in the second you do not know
<spectie> it may be ^Az and ^Al or ^A z ^A l
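
The argument in the log can be illustrated with a small Python sketch. It is an illustration only, not how any particular segmenter works: the function names `split_braced` and `split_with_alphabet` are hypothetical, and the greedy longest-match strategy is an assumption chosen for simplicity.

```python
import re


def split_braced(text: str) -> list[str]:
    """Split into symbols, treating each {...} group as one symbol.

    No knowledge of the declared multicharacter symbols is needed.
    """
    return re.findall(r"\{[^{}]*\}|.", text)


def split_with_alphabet(text: str, multichar: set[str]) -> list[str]:
    """Greedy longest-match split; depends on the declared symbols."""
    symbols = []
    longest = max((len(s) for s in multichar), default=1)
    i = 0
    while i < len(text):
        # Try the longest candidate first, fall back to a single character.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if size == 1 or chunk in multichar:
                symbols.append(chunk)
                i += size
                break
    return symbols


print(split_braced("foo{A}z{A}l"))
print(split_with_alphabet("foo^Az^Al", {"^A"}))
print(split_with_alphabet("foo^Az^Al", {"^Az", "^Al"}))
```

The braced string has a single reading, while the caret string splits differently depending on which symbol set the program is given, which is the ambiguity described above.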

@@ Line 1: / Line 1: @@
+==Guidelines==
-==Standards for archiphonemes==
 * Archiphonemes should be a single character.
@@ Line 5: / Line 5: @@
 * Archiphonemes should be declared in the <code>Multichar_Symbols</code> section in the header of the file after the grammatical tags, with a comment giving their possible forms.
 * If the archiphoneme is subject to deletion, it should be written in lower case, e.g. <code>{s}</code>
-* If the archiphoneme is never deleted, it should be written in upper case, e.g. <code>{A}</code>
+* If the archiphoneme has a range of default surface forms (even if rarely subject to deletion), it should be written in upper case, e.g. <code>{A}</code>
+* If the archiphoneme is always deleted, it ''may'' consist of more than one character, e.g. <code>{dup}</code>. This is, however, advised against.
+==Common archiphonemes==
+==Frequently asked questions==
+; Why use {C} and not ^C ?
+<pre>
+<spectie> was thinking about {A} over ^A
+<Flammie> good
+<spectie> and worked out a nice argument for it (aside from pure aesthetics):
+<spectie> other programs (e.g. morphological segmenters) parsing the output with {A} don't need to know about multicharacter symbols
+<spectie> compare:
+<spectie> foo{A}z{A}l
+<spectie> foo^Az^Al
+<spectie> with the first you know where the symbol ends
+<spectie> in the second you do not know
+<spectie> it may be ^Az and ^Al or ^A z ^A l
+</pre>
 [[Category:Terminology]]
+[[Category:HFST]]
+[[Category:Writing dictionaries]]
+[[Category:Documentation in English]]
