Twol rules in lttoolbox

Revision as of 08:49, 20 May 2018

Current Status: In Progress
Project: Extend lttoolbox to have the power of HFST

Guidelines

  • Every rule in the dictionary file must be compatible with the HFST twolc engine and must not introduce ambiguity.
  • The XML tags for archiphonemes and rules must be well defined and distinct from the tags already used in lttoolbox.
  • Every rule entry should carry a comment that briefly explains the morphophonological transformation performed by the twol compiler.
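
The second guideline can be checked mechanically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; the `EXISTING_DIX_TAGS` set is an illustrative, non-exhaustive list of element names used in lttoolbox dictionaries, and `colliding_tags` is a hypothetical helper, not part of lttoolbox. The sample writes empty elements in self-closed form so that it parses.

```python
import xml.etree.ElementTree as ET

# Illustrative, non-exhaustive subset of element names used in lttoolbox
# .dix files; treat this list as an assumption, not a full inventory.
EXISTING_DIX_TAGS = {
    "dictionary", "alphabet", "sdefs", "sdef", "pardefs", "pardef",
    "section", "e", "p", "l", "r", "i", "par", "re", "s", "b", "g", "j", "a",
}


def colliding_tags(xml_text):
    """Return the sorted element names in xml_text that are already dix tags."""
    root = ET.fromstring(xml_text)
    used = {element.tag for element in root.iter()}
    return sorted(used & EXISTING_DIX_TAGS)


SAMPLE = """
<rules>
  <rule c="Back vowel harmony for archiphoneme A">
    <m><ar n="A"/></m><s>a</s>
    <context constraint="e"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
  </rule>
</rules>
"""

print(colliding_tags(SAMPLE))
```

With this subset the check reports `s`, because dictionaries already use `<s n="..."/>` for symbol tags; the surface-side tag may therefore need a different name or a separate namespace.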

Design

  • The design is still in the development stage and may need significant modifications after it is implemented on the existing language pairs.
  • The design must be robust enough to support


Archiphonemes

<archiphoneme>
  <ar n="A" alpha="ae"/>
  <ar n="B" alpha="bcd"/>
</archiphoneme>
{|class=wikitable
! Tag/Symbol !! Meaning
|-
| '''ar''' || archiphoneme
|-
| '''alpha''' || alphabet
|}
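
As an illustration of what the archiphoneme block encodes, the sketch below reads it with Python's standard `xml.etree.ElementTree` and lists the lexical:surface pairs each archiphoneme allows. It assumes that every character of `alpha` is one surface symbol, which the design does not state explicitly; `archiphoneme_pairs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ARCHIPHONEMES = """
<archiphoneme>
  <ar n="A" alpha="ae"/>
  <ar n="B" alpha="bcd"/>
</archiphoneme>
"""


def archiphoneme_pairs(xml_text):
    """Map each archiphoneme name to the surface symbols it may realise as."""
    root = ET.fromstring(xml_text)
    # Assumption: each character of `alpha` is a separate surface symbol.
    return {ar.get("n"): list(ar.get("alpha", "")) for ar in root.iter("ar")}


pairs = archiphoneme_pairs(ARCHIPHONEMES)
for name, surfaces in pairs.items():
    # Print in a twolc-like "lexical:surface" notation.
    print(" ".join(f"{name}:{surface}" for surface in surfaces))
```

For the sample above this lists `A:a A:e` and `B:b B:c B:d`, the symbol pairs that the twol rules then constrain.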

Sets

<sets>
  <set n="Vowels" alpha="aeiou"/>
  <set n="BackVow" alpha="bcdfg"/>
</sets>
{|class=wikitable
! Tag/Symbol !! Meaning
|-
| '''set''' || set/group of alphabets
|-
| '''n''' || set name
|-
| '''alpha''' || alphabet
|}
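
Since the goal is compatibility with HFST twolc, it may help to see how the set definitions could map onto a twolc `Sets` section. This is a sketch under assumptions: it treats each character of `alpha` as one symbol, and `twolc_sets` is a hypothetical helper rather than an existing lttoolbox or HFST function.

```python
import xml.etree.ElementTree as ET

SETS = """
<sets>
  <set n="Vowels" alpha="aeiou"/>
  <set n="BackVow" alpha="bcdfg"/>
</sets>
"""


def twolc_sets(xml_text):
    """Render <set> elements as the lines of a twolc-style Sets section."""
    root = ET.fromstring(xml_text)
    lines = ["Sets"]
    for set_element in root.iter("set"):
        # Assumption: each character of `alpha` is one symbol of the set.
        symbols = " ".join(set_element.get("alpha", ""))
        lines.append(f"{set_element.get('n')} = {symbols} ;")
    return "\n".join(lines)


print(twolc_sets(SETS))
```

In a real twolc grammar the symbols used in a set also have to be declared in the `Alphabet` section; that step is left out here.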

Twol Rules

<rules>
  <!-- e: A:a <=> BackVow _ -->
  <rule c="Back vowel harmony for archiphoneme A">
    <m><ar n="A"/></m><s>a</s>
    <context constraint="e"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
  </rule>
  <!-- f: hyph?:- => Vowels _ -->
  <rule c="Only hyphen in vowel boundaries and caps">
    <m><ar n="hyph?"/></m><s>-</s>
    <context constraint="f"><l_c><set n="Vowels"/></l_c><r_c></r_c></context>
  </rule>
  <!-- b: A:a <= BackVow _ -->
  <rule c="Back vowel harmony for archiphoneme A">
    <m><ar n="A"/></m><s>a</s>
    <context constraint="b"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
  </rule>
  <!-- ne: A:a /<= BackVow _ -->
  <rule c="Back vowel harmony for archiphoneme A">
    <m><ar n="A"/></m><s>a</s>
    <context constraint="ne"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
  </rule>
</rules>
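{|class=wikitable
! Tag/Symbol !! Meaning
|-
| '''rule''' || twol rule
|-
| '''c''' || comment
|-
| '''m''' || morphotactic side
|-
| '''s''' || surface side
|-
| '''context''' || context for transformation
|-
| '''constraint''' || direction constraint (one of f, b, e, ne)
|-
| '''f''' || a:b => _ ; If the symbol pair a:b appears, it must be in context _
|-
| '''b''' || a:b <= _ ; If lexical a appears in context _, it must correspond to surface b
|-
| '''e''' || a:b <=> _ ; Lexical a corresponds to surface b in context _ and only in that context
|-
| '''ne''' || a:b /<= _ ; Lexical a never corresponds to surface b in context _
|-
| '''r_c''' || right context
|-
| '''l_c''' || left context
|}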
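
To make the constraint codes concrete, the sketch below converts rule elements into twolc-style rule lines using the operator mapping from the table. It is only an illustration: `to_twolc` and `side_text` are hypothetical helpers, the sample writes empty elements in self-closed form so that it parses, and archiphonemes are printed by bare name, whereas an actual twolc grammar would need them declared as escaped multicharacter symbols.

```python
import xml.etree.ElementTree as ET

# Constraint codes from the table mapped to twolc rule operators.
OPERATORS = {"f": "=>", "b": "<=", "e": "<=>", "ne": "/<="}

RULES = """
<rules>
  <rule c="Back vowel harmony for archiphoneme A">
    <m><ar n="A"/></m><s>a</s>
    <context constraint="e"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
  </rule>
  <rule c="Back vowel harmony for archiphoneme A">
    <m><ar n="A"/></m><s>a</s>
    <context constraint="ne"><l_c><set n="BackVow"/></l_c><r_c></r_c></context>
  </rule>
</rules>
"""


def side_text(element):
    """Render an <m>, <s>, <l_c> or <r_c> element as space-separated symbols."""
    if element is None:
        return ""
    parts = []
    if element.text and element.text.strip():
        parts.append(element.text.strip())
    for child in element:
        # <ar n="..."/> and <set n="..."/> are rendered by their name.
        parts.append(child.get("n", ""))
    return " ".join(parts)


def to_twolc(rule):
    """Turn one <rule> element into a named twolc-style rule."""
    context = rule.find("context")
    operator = OPERATORS[context.get("constraint")]
    pair = f"{side_text(rule.find('m'))}:{side_text(rule.find('s'))}"
    left = side_text(context.find("l_c"))
    right = side_text(context.find("r_c"))
    # Empty contexts are dropped so no stray spaces appear around "_".
    body = " ".join(part for part in (pair, operator, left, "_", right) if part)
    return f'"{rule.get("c")}"\n{body} ;'


converted = [to_twolc(rule) for rule in ET.fromstring(RULES).iter("rule")]
print("\n\n".join(converted))
```

The first rule comes out as `A:a <=> BackVow _ ;` and the second as `A:a /<= BackVow _ ;`, each preceded by its comment as the rule name.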