Current Status: In Progress
Project: Extend lttoolbox to have the power of HFST
Guidelines
- Every rule in the dictionary file must be compatible with the HFST twolc engine and must not introduce any ambiguity.
- The XML tags for archiphonemes and rules must be well defined and distinct from the existing lttoolbox tags.
- Every rule entry should carry a comment that briefly explains the morphophonological transformation performed by the twol compiler.
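The comment guideline could be checked mechanically. The sketch below is only an illustration: it assumes the `<rule c="...">` form proposed in the Twol Rules section, and the helper name `rules_missing_comments` and the sample input are hypothetical, not part of lttoolbox.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: the second rule has an empty comment.
RULES_XML = """
<rules>
  <rule c="Back vowel harmony for archiphoneme A">
    <m><ar n="A"/></m><s>a</s>
  </rule>
  <rule c="">
    <m><ar n="B"/></m><s>b</s>
  </rule>
</rules>
"""

def rules_missing_comments(xml_text):
    """Return 1-based positions of <rule> elements whose c attribute is empty or absent."""
    root = ET.fromstring(xml_text)
    return [
        position
        for position, rule in enumerate(root.iter("rule"), start=1)
        if not (rule.get("c") or "").strip()
    ]

print(rules_missing_comments(RULES_XML))  # [2]
```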
Design
- The design is still under development and may need significant changes once it has been tried on the existing language pairs.
- The design must be robust enough to support
Archiphonemes
<archiphoneme>
<ar n="A" alpha="ae"/>
<ar n="B" alpha="bcd"/>
</archiphoneme>
Tag/Symbol | Meaning
ar | archiphoneme
n | archiphoneme name
alpha | alphabet (symbols covered by the archiphoneme)
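As a rough illustration of how a compiler front end might read this block, the sketch below parses the `<archiphoneme>` element with Python's standard `xml.etree.ElementTree` and maps each archiphoneme name to its symbols. The function name `load_archiphonemes` and the reading of `alpha` as one symbol per character are assumptions, not existing lttoolbox behaviour.

```python
import xml.etree.ElementTree as ET

ARCHIPHONEME_XML = """
<archiphoneme>
  <ar n="A" alpha="ae"/>
  <ar n="B" alpha="bcd"/>
</archiphoneme>
"""

def load_archiphonemes(xml_text):
    """Map each archiphoneme name to the list of symbols in its alpha attribute.

    Assumption: every character of `alpha` is a separate symbol.
    """
    root = ET.fromstring(xml_text)
    return {ar.get("n"): list(ar.get("alpha", "")) for ar in root.iter("ar")}

archiphonemes = load_archiphonemes(ARCHIPHONEME_XML)
print(archiphonemes)  # {'A': ['a', 'e'], 'B': ['b', 'c', 'd']}
```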
Sets
<sets>
<set n="Vowels" alpha="aeiou"/>
<set n="BackVow" alpha="bcdfg"/>
</sets>
Tag/Symbol | Meaning
set | set (group) of alphabet symbols
n | set name
alpha | alphabet (symbols in the set)
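One plausible use of the `<sets>` block is to generate the `Sets` section of a twolc file, where each set is written as `Name = symbol symbol ... ;`. The sketch below shows that conversion; the function name `sets_to_twolc` is hypothetical, and treating each character of `alpha` as one symbol is an assumption.

```python
import xml.etree.ElementTree as ET

SETS_XML = """
<sets>
  <set n="Vowels" alpha="aeiou"/>
  <set n="BackVow" alpha="bcdfg"/>
</sets>
"""

def sets_to_twolc(xml_text):
    """Render a <sets> block as the text of a twolc Sets section.

    Assumption: every character of `alpha` is a separate symbol.
    """
    root = ET.fromstring(xml_text)
    lines = ["Sets"]
    for set_element in root.iter("set"):
        symbols = " ".join(set_element.get("alpha", ""))
        lines.append(f"{set_element.get('n')} = {symbols} ;")
    return "\n".join(lines)

print(sets_to_twolc(SETS_XML))
# Sets
# Vowels = a e i o u ;
# BackVow = b c d f g ;
```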
Twol Rules
<rules>
<rule c="Back vowel harmony for archiphoneme A">
<m><ar n="A"/></m><s>a</s>
<context constraint="e"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
</rule>
<rule c="Only hyphen in vowel boundaries and caps">
<m><ar n="hyph?"/></m><s>-</s>
<context constraint="f"><l_c><set n="Vowels"/></l_c><r_c></r_c></context>
</rule>
<rule c="Back vowel harmony for archiphoneme A">
<m><ar n="A"/></m><s>a</s>
<context constraint="b"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
</rule>
<rule c="Back vowel harmony for archiphoneme A">
<m><ar n="A"/></m><s>a</s>
<context constraint="ne"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
</rule>
</rules>
Tag/Symbol | Meaning
rule | twol rule
c | comment
m | morphotactic side
s | surface side
context | context for transformation
constraint | direction constraint (one of f, b, e, ne)
f | a:b => _ ; If the symbol pair a:b appears, it must be in context _
b | a:b <= _ ; If lexical a appears in context _, it must correspond to surface b
e | a:b <=> _ ; Lexical a corresponds to surface b in context _, and only in that context
ne | a:b /<= _ ; Lexical a never corresponds to surface b in context _
r_c | right context
l_c | left context
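To make the mapping from constraint codes to twolc operators concrete, the sketch below converts a single `<rule>` element into a twolc-style rule of the form `"comment"` followed by `lexical:surface OPERATOR left _ right ;`. It is a minimal illustration under assumptions: the names `OPERATORS`, `side_to_twolc` and `rule_to_twolc` are hypothetical, the input uses self-closing `<ar/>` and `<set/>` tags, and the archiphoneme name is emitted unescaped, whereas a real twolc file would need special characters and multi-character symbols escaped.

```python
import xml.etree.ElementTree as ET

# Constraint codes from the table above mapped to twolc rule operators.
OPERATORS = {"f": "=>", "b": "<=", "e": "<=>", "ne": "/<="}

RULE_XML = """
<rule c="Back vowel harmony for archiphoneme A">
  <m><ar n="A"/></m><s>a</s>
  <context constraint="e"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
</rule>
"""

def side_to_twolc(element):
    """Render one context side: literal text first, then referenced set names."""
    parts = []
    if element is not None:
        if element.text and element.text.strip():
            parts.append(element.text.strip())
        parts.extend(child.get("n") for child in element.iter("set"))
    return " ".join(parts)

def rule_to_twolc(xml_text):
    """Turn one <rule> element into a two-line twolc-style rule string."""
    rule = ET.fromstring(xml_text)
    lexical = rule.find("m/ar").get("n")
    surface = rule.findtext("s").strip()
    context = rule.find("context")
    operator = OPERATORS[context.get("constraint")]
    left = side_to_twolc(context.find("l_c"))
    right = side_to_twolc(context.find("r_c"))
    # Empty context sides are dropped so the output has no stray spaces.
    body = " ".join(part for part in (left, "_", right) if part)
    return f'"{rule.get("c")}"\n{lexical}:{surface} {operator} {body} ;'

print(rule_to_twolc(RULE_XML))
# "Back vowel harmony for archiphoneme A"
# A:a <=> BackVow _ ;
```

Keeping the code-to-operator mapping in one table means a new constraint code only needs one extra entry, and an unknown code fails with a `KeyError` rather than silently producing an invalid rule.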