Lttoolbox-java/Flag diacritics

From Apertium
Revision as of 13:32, 5 March 2012

This page describes an experimental mode for lttoolbox-java which was written in December 2010 but never put into use.

The idea is to add 'flag' symbols that must match in the lttoolbox expansion.

Given the symbols

<sdef n="mi:1" /> 
<sdef n="mi:0" />
<sdef n="bi:0" />
<sdef n="bi:1" />

A given expansion is pruned away if it contains conflicting flag symbols. For example, an expansion containing both <mi:1> and <mi:0> is pruned away.

An expansion containing <mi:1> and <bi:0> is not pruned away, as these flags do not conflict.
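
The rule above can be sketched in a few lines of Python. This is only an illustration, not the lttoolbox-java implementation: it assumes that a flag is any symbol of the form name:value and that two flags conflict when they share a name but differ in value. The function names parse_flag and has_conflict are made up for this sketch.

```python
def parse_flag(symbol):
    """Split a flag symbol such as 'mi:1' into (name, value).

    Returns None for ordinary tags such as 'vblex'.
    Assumption: every symbol containing ':' is a flag.
    """
    name, sep, value = symbol.partition(":")
    return (name, value) if sep else None


def has_conflict(symbols):
    """Return True if some flag name occurs with two different values."""
    seen = {}
    for symbol in symbols:
        flag = parse_flag(symbol)
        if flag is None:
            continue  # ordinary tag, never conflicts
        name, value = flag
        # setdefault stores the first value seen for this flag name
        if seen.setdefault(name, value) != value:
            return True
    return False


print(has_conflict(["mi:1", "mi:0"]))  # True: same flag, different values
print(has_conflict(["mi:1", "bi:0"]))  # False: different flags
```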

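Here is an example from FlagMatchingTest.java (http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/test/org/apertium/lttoolbox/FlagMatchingTest.java?view=markup):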

<pardef n="prefix">
  <!-- 'bi-' is the prefix for the subjunctive (paired with <prs> below) -->
  <e>       <p><l>bi</l>        <r><s n="bi:1"/></r></p></e>

  <!-- 'mi-' is the prefix for the indicative (paired with <pri> below) -->
  <e>       <p><l>mi</l>        <r><s n="mi:1"/></r></p></e>

  <!-- no prefix -->
  <e>       <p><l></l>          <r><s n="xp:1"/></r></p></e>
</pardef>

<pardef n="khoda/n__vblex">
  <e>       <p><l>m</l>         <r>n<s n="xp:0"/><s n="mi:1"/><s n="bi:0"/><s n="vblex"/><s n="pri"/><s n="p1"/><s n="sg"/></r></p></e>
  <e>       <p><l>m</l>         <r>n<s n="xp:0"/><s n="mi:0"/><s n="bi:1"/><s n="vblex"/><s n="prs"/><s n="p1"/><s n="sg"/></r></p></e>

  <e>       <p><l>š</l>         <r>n<s n="xp:0"/><s n="mi:1"/><s n="bi:0"/><s n="vblex"/><s n="pri"/><s n="p2"/><s n="sg"/></r></p></e>
  <e>       <p><l>š</l>         <r>n<s n="xp:0"/><s n="mi:0"/><s n="bi:1"/><s n="vblex"/><s n="prs"/><s n="p2"/><s n="sg"/></r></p></e>

  <e>       <p><l></l>          <r>n<s n="xp:0"/><s n="mi:1"/><s n="bi:0"/><s n="vblex"/><s n="pri"/><s n="p3"/><s n="sg"/></r></p></e>
  <e>       <p><l></l>          <r>n<s n="xp:0"/><s n="mi:0"/><s n="bi:1"/><s n="vblex"/><s n="prs"/><s n="p3"/><s n="sg"/></r></p></e>
  <e>       <p><l>n</l>         <r>n<s n="xp:1"/><s n="vblex"/><s n="inf"/></r></p></e>
</pardef>


Now, if the two pardefs are combined in an entry such as


<e lm="khodan">          <par n="prefix"/><i>khoda</i><par n="khoda/n__vblex"/></e>

we do not get all 3 × 7 = 21 possibilities, because combinations with conflicting flags are ruled out.

Effectively we get only:

bikhodam:<bi:1>khodan<mi:0><xp:0><vblex><prs><p1><sg>
bikhodaš:<bi:1>khodan<mi:0><xp:0><vblex><prs><p2><sg>
bikhoda:<bi:1>khodan<mi:0><xp:0><vblex><prs><p3><sg>
mikhodam:<mi:1>khodan<bi:0><xp:0><vblex><pri><p1><sg>
mikhodaš:<mi:1>khodan<bi:0><xp:0><vblex><pri><p2><sg>
mikhoda:<mi:1>khodan<bi:0><xp:0><vblex><pri><p3><sg>
khodan:<xp:1>khodan<xp:1><bi:0><mi:0><vblex><inf>
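
The pruning of the 21 combinations can be reproduced with a small Python sketch. It is not how lttoolbox-java works internally. It simply enumerates every prefix/suffix pair from the two pardefs and drops those whose flags conflict, using the same assumed rule as before (same flag name, different value). One further assumption: the infinitive entry is given the flags <bi:0> and <mi:0>, as the analysis listed for khodan suggests. With only <xp:1>, as in the pardef excerpt, the prefixed forms bikhodan and mikhodan would survive as well.

```python
from itertools import product

# (surface prefix, symbols) taken from the "prefix" pardef
PREFIXES = [
    ("bi", ["bi:1"]),
    ("mi", ["mi:1"]),
    ("", ["xp:1"]),
]

# (surface suffix, symbols) taken from the "khoda/n__vblex" pardef
SUFFIXES = [
    ("m", ["xp:0", "mi:1", "bi:0", "vblex", "pri", "p1", "sg"]),
    ("m", ["xp:0", "mi:0", "bi:1", "vblex", "prs", "p1", "sg"]),
    ("š", ["xp:0", "mi:1", "bi:0", "vblex", "pri", "p2", "sg"]),
    ("š", ["xp:0", "mi:0", "bi:1", "vblex", "prs", "p2", "sg"]),
    ("", ["xp:0", "mi:1", "bi:0", "vblex", "pri", "p3", "sg"]),
    ("", ["xp:0", "mi:0", "bi:1", "vblex", "prs", "p3", "sg"]),
    # Assumption: bi:0 and mi:0 added to match the listed analysis of "khodan"
    ("n", ["xp:1", "bi:0", "mi:0", "vblex", "inf"]),
]


def conflicts(symbols):
    """True if a flag name (the part before ':') occurs with two values."""
    seen = {}
    for symbol in symbols:
        if ":" not in symbol:
            continue  # ordinary tag
        name, value = symbol.split(":", 1)
        if seen.setdefault(name, value) != value:
            return True
    return False


def expand(stem="khoda"):
    """Return (surface form, analysis) pairs that survive flag matching."""
    results = []
    for (pre, pre_syms), (suf, suf_syms) in product(PREFIXES, SUFFIXES):
        if conflicts(pre_syms + suf_syms):
            continue  # pruned: conflicting flags
        surface = pre + stem + suf
        tags = lambda syms: "".join(f"<{s}>" for s in syms)
        # every right-hand side in the suffix pardef starts with "n"
        analysis = tags(pre_syms) + stem + "n" + tags(suf_syms)
        results.append((surface, analysis))
    return results


for surface, analysis in expand():
    print(f"{surface}:{analysis}")
print(len(PREFIXES) * len(SUFFIXES), "candidates,", len(expand()), "survive")
```

Under these assumptions 7 of the 21 candidates survive. The analyses printed by the sketch keep every flag in pardef order, so they are more verbose than the listing above.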

The lttoolbox-java processor automatically removes flag symbols from the output, so in the end we get just:

bikhodam:khodan<vblex><prs><p1><sg>
bikhodaš:khodan<vblex><prs><p2><sg>
bikhoda:khodan<vblex><prs><p3><sg>
mikhodam:khodan<vblex><pri><p1><sg>
mikhodaš:khodan<vblex><pri><p2><sg>
mikhoda:khodan<vblex><pri><p3><sg>
khodan:khodan<vblex><inf>
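
The removal of flag symbols can be illustrated with a regular expression. Again, this is a sketch under an assumption, not the processor's actual code: a flag is taken to be any symbol in angle brackets that contains a colon. The name strip_flags is invented for the example.

```python
import re

# Assumption: a flag symbol looks like <name:value>; ordinary tags have no ':'
FLAG = re.compile(r"<[^<>:]+:[^<>]+>")


def strip_flags(analysis):
    """Remove every <name:value> symbol and keep ordinary tags."""
    return FLAG.sub("", analysis)


print(strip_flags("khodan:<xp:1>khodan<xp:1><bi:0><mi:0><vblex><inf>"))
# khodan:khodan<vblex><inf>
```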


Usage

Here are the relevant options of the lttoolbox-java processor:

$ lt-proc-j 
  -f:   match flags (experimental)
  -S:   show hidden control symbols (for flagmatch and compounding)


Further reading

http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/testdata/flag_matching/persian.dix?view=markup


http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/testdata/flag_matching/persian2.dix?view=markup