Lttoolbox-java/Flag diacritics

From Apertium

Revision as of 12:37, 5 March 2012

This page describes an experimental mode for lttoolbox-java that was written in December 2010 but never brought into use.

The idea is to add 'flag' symbols whose values must agree within each expansion that lttoolbox generates.

Given the symbols

<sdef n="mi:1" />
<sdef n="mi:0" />
<sdef n="bi:0" />
<sdef n="bi:1" />

An expansion is pruned away if it contains conflicting flag symbols. For example, an expansion containing both <mi:1> and <mi:0> is pruned away.

An expansion containing <mi:1> and <bi:0> is not pruned away, because these flags do not conflict.
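
The rule can be illustrated with a small Python sketch. This is not the lttoolbox-java code; it assumes that a flag symbol always has the form <name:value> and that two flags conflict exactly when they share a name but differ in value. The names FLAG and has_conflict are made up for this illustration.

```python
import re

# Assumption: a flag symbol looks like <name:value>, e.g. <mi:1>.
# Ordinary symbols such as <vblex> contain no colon and are ignored.
FLAG = re.compile(r"<([^<>:]+):([^<>:]+)>")


def has_conflict(analysis: str) -> bool:
    """Return True if some flag name occurs with two different values."""
    seen = {}
    for name, value in FLAG.findall(analysis):
        # setdefault stores the first value seen for a name and returns it
        # on later occurrences, so a mismatch signals a conflict.
        if seen.setdefault(name, value) != value:
            return True
    return False
```

Under these assumptions has_conflict("<mi:1><mi:0>") is True, while has_conflict("<mi:1><bi:0>") is False.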

Here is an example from [1]:

<pardef n="prefix">
  <!-- 'bi-' is the prefix for indicative -->
  <e> <p><l>bi</l> <r><s n="bi:1"/></r></p></e>
  <!-- 'mi-' is the prefix for subjunctive -->
  <e> <p><l>mi</l> <r><s n="mi:1"/></r></p></e>
  <e> <p><l></l> <r><s n="xp:1"/></r></p></e>
</pardef>

<pardef n="khoda/n__vblex">
  <e> <p><l>m</l> <r>n<s n="xp:0"/><s n="mi:1"/><s n="bi:0"/><s n="vblex"/><s n="pri"/><s n="p1"/><s n="sg"/></r></p></e>
  <e> <p><l>m</l> <r>n<s n="xp:0"/><s n="mi:0"/><s n="bi:1"/><s n="vblex"/><s n="prs"/><s n="p1"/><s n="sg"/></r></p></e>
  <e> <p><l>š</l> <r>n<s n="xp:0"/><s n="mi:1"/><s n="bi:0"/><s n="vblex"/><s n="pri"/><s n="p2"/><s n="sg"/></r></p></e>
  <e> <p><l>š</l> <r>n<s n="xp:0"/><s n="mi:0"/><s n="bi:1"/><s n="vblex"/><s n="prs"/><s n="p2"/><s n="sg"/></r></p></e>
  <e> <p><l></l> <r>n<s n="xp:0"/><s n="mi:1"/><s n="bi:0"/><s n="vblex"/><s n="pri"/><s n="p3"/><s n="sg"/></r></p></e>
  <e> <p><l></l> <r>n<s n="xp:0"/><s n="mi:0"/><s n="bi:1"/><s n="vblex"/><s n="prs"/><s n="p3"/><s n="sg"/></r></p></e>
  <e> <p><l>n</l> <r>n<s n="xp:1"/><s n="vblex"/><s n="inf"/></r></p></e>
</pardef>


Now, if the two pardefs are combined in an entry such as

<e lm="khodan"> <par n="prefix"/><i>khoda</i><par n="khoda/n__vblex"/></e>

we do not get all 3×7 = 21 possibilities, because combinations with conflicting flags are ruled out.

Effectively we get only:

bikhodam:<bi:1>khodan<mi:0><xp:0><vblex><prs><p1><sg>
bikhodaš:<bi:1>khodan<mi:0><xp:0><vblex><prs><p2><sg>
bikhoda:<bi:1>khodan<mi:0><xp:0><vblex><prs><p3><sg>
mikhodam:<mi:1>khodan<bi:0><xp:0><vblex><pri><p1><sg>
mikhodaš:<mi:1>khodan<bi:0><xp:0><vblex><pri><p2><sg>
mikhoda:<mi:1>khodan<bi:0><xp:0><vblex><pri><p3><sg>
khodan:<xp:1>khodan<xp:1><bi:0><mi:0><vblex><inf>
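
The pruning can be reproduced outside lttoolbox with a short Python sketch. It is only an illustration, not the lttoolbox-java algorithm: it pairs every prefix entry with every ending entry and keeps a pair only if no flag name appears with two different values. PREFIXES, ENDINGS, consistent and expand are hypothetical names. One assumption deserves attention: the infinitive entry here carries the <bi:0> and <mi:0> flags that appear in the expanded listing above, whereas the pardef as quoted carries only <xp:1>. Without those two flags this simple rule would also let bikhodan and mikhodan through.

```python
from itertools import product

# Entries transcribed from the "prefix" pardef: (surface prefix, symbols).
PREFIXES = [("bi", ["bi:1"]), ("mi", ["mi:1"]), ("", ["xp:1"])]

# Entries transcribed from "khoda/n__vblex": (surface ending, symbols).
ENDINGS = [
    ("m", ["xp:0", "mi:1", "bi:0", "vblex", "pri", "p1", "sg"]),
    ("m", ["xp:0", "mi:0", "bi:1", "vblex", "prs", "p1", "sg"]),
    ("š", ["xp:0", "mi:1", "bi:0", "vblex", "pri", "p2", "sg"]),
    ("š", ["xp:0", "mi:0", "bi:1", "vblex", "prs", "p2", "sg"]),
    ("", ["xp:0", "mi:1", "bi:0", "vblex", "pri", "p3", "sg"]),
    ("", ["xp:0", "mi:0", "bi:1", "vblex", "prs", "p3", "sg"]),
    # Assumption: bi:0 and mi:0 added to match the expanded listing.
    ("n", ["xp:1", "bi:0", "mi:0", "vblex", "inf"]),
]


def consistent(symbols):
    """True if no flag name (the part before ':') has two different values."""
    seen = {}
    for symbol in symbols:
        if ":" in symbol:
            name, value = symbol.split(":", 1)
            if seen.setdefault(name, value) != value:
                return False
    return True


def expand(stem="khoda"):
    """Return (surface form, non-flag tags) for every surviving combination."""
    survivors = []
    for (prefix, pre_syms), (ending, end_syms) in product(PREFIXES, ENDINGS):
        if consistent(pre_syms + end_syms):
            tags = [s for s in end_syms if ":" not in s]
            survivors.append((prefix + stem + ending, tags))
    return survivors


for form, tags in expand():
    print(form, "".join(f"<{t}>" for t in tags))
```

Of the 21 combinations, 7 survive in this sketch: three with bi-, three with mi-, and the bare infinitive.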

The lttoolbox-java processor automatically removes flag symbols from the output, so in the end we get just

bikhodam:khodan<vblex><prs><p1><sg>
bikhodaš:khodan<vblex><prs><p2><sg>
bikhoda:khodan<vblex><prs><p3><sg>
mikhodam:khodan<vblex><pri><p1><sg>
mikhodaš:khodan<vblex><pri><p2><sg>
mikhoda:khodan<vblex><pri><p3><sg>
khodan:khodan<vblex><inf>
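
The removal step can be imitated with a single regular expression. The sketch below is an assumption about how such a filter might look, not the processor's actual code; it treats every symbol containing a colon as a flag symbol, and strip_flags is a hypothetical helper name.

```python
import re

# Assumption: flag symbols are exactly the <...> symbols that contain a colon.
FLAG_SYMBOL = re.compile(r"<[^<>:]+:[^<>]+>")


def strip_flags(line: str) -> str:
    """Remove flag symbols such as <bi:1>, keeping ordinary tags like <vblex>."""
    return FLAG_SYMBOL.sub("", line)


print(strip_flags("bikhodam:<bi:1>khodan<mi:0><xp:0><vblex><prs><p1><sg>"))
# prints: bikhodam:khodan<vblex><prs><p1><sg>
```

The colon that separates the surface form from the analysis is left alone, because it is not enclosed in angle brackets.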


Usage

Here are the relevant command-line options of the lttoolbox-java processor:

$ lt-proc-j
  -f: match flags (experimental)
  -S: show hidden control symbols (for flagmatch and compounding)


Further reading

http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/testdata/flag_matching/persian.dix?view=markup


http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/testdata/flag_matching/persian2.dix?view=markup