Autoconcord
Making the bidix concord with the monodices
The apertium-dixtools package contains a tool that automatically makes symbols (gender, number, ...) in the bidix agree with the monodices.
How does it work?
Some preparations are needed.
The tool looks in the monodices for a special autoconcord comment in the paradigms:
<pardef n="ackord__n" c="autoconcord:nt,sp">
  <e> <p><l></l> <r><s n="n"/><s n="nt"/><s n="sp"/><s n="ind"/></r></p></e>
  <e> <p><l>et</l> <r><s n="n"/><s n="nt"/><s n="sg"/><s n="def"/></r></p></e>
  <e> <p><l>en</l> <r><s n="n"/><s n="nt"/><s n="pl"/><s n="def"/></r></p></e>
</pardef>
...
<e lm="avbrott"> <i>avbrott</i><par n="ackord__n"/></e>
This comment makes all entries using paradigm ackord__n have the autoconcord symbols 'nt' and 'sp'.
The bidix contains
<e><p><l>avbrott<s n="n"/></l><r>afbrydelse<s n="n"/></r></p></e>
The right dix has the autoconcord symbols 'ut' and 'sgpl' for the lemma:
<pardef n="abe__n" c="autoconcord:ut,sgpl">
  <e> <p><l></l> <r><s n="n"/><s n="ut"/><s n="sg"/><s n="ind"/></r></p></e>
  <e> <p><l>n</l> <r><s n="n"/><s n="ut"/><s n="sg"/><s n="def"/></r></p></e>
  <e> <p><l>r</l> <r><s n="n"/><s n="ut"/><s n="pl"/><s n="ind"/></r></p></e>
  <e> <p><l>rne</l> <r><s n="n"/><s n="ut"/><s n="pl"/><s n="def"/></r></p></e>
</pardef>
...
<e lm="afbrydelse"> <i>afbrydelse</i><par n="abe__n"/></e>
What does it do?
Autoconcord will try to make the autoconcord symbols of the left dix (nt,sp) concord with those of the right dix (ut,sgpl). It does so by pairing them one by one: nt-ut and sp-sgpl. Then it searches the bidix for paradigms with the special autoconcord comments "autoconcord:nt-ut" and "autoconcord:sp-sgpl":
<pardef n="_nt_ut" c="autoconcord:nt-ut">
  <e> <p><l><s n="nt"/></l><r><s n="ut"/></r></p></e>
</pardef>
<pardef n="_sp_sgpl" c="autoconcord:sp-sgpl">
  <e r="LR"><p><l><s n="sp"/><s n="ind"/></l><r><s n="ND"/><s n="ind"/></r></p></e>
  <e r="RL"><p><l><s n="sp"/><s n="ind"/></l><r><s n="sg"/><s n="ind"/></r></p></e>
  <e r="RL"><p><l><s n="sp"/><s n="ind"/></l><r><s n="pl"/><s n="ind"/></r></p></e>
  <e> <p><l><s n="sg"/><s n="def"/></l><r><s n="sg"/><s n="def"/></r></p></e>
  <e> <p><l><s n="pl"/><s n="def"/></l><r><s n="pl"/><s n="def"/></r></p></e>
</pardef>
and then it will change the bidix entry from
<e>
<l>avbrott</l><r>afbrydelse</r>
</e>
to include the autoconcord paradigms in the bidix:
<e>
<l>avbrott</l><r>afbrydelse</r>
<par n="_nt_ut"/><par n="_sp_sgpl"/></e>
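The pairing step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in apertium-dixtools: the function names (autoconcord_comments, pair_symbols, paradigms_for) and the cut-down dictionaries are hypothetical, and returning None when a pair has no matching bidix paradigm is an assumption about what "cannot be resolved" might mean.

```python
import xml.etree.ElementTree as ET

PREFIX = "autoconcord:"

# Trimmed-down stand-ins for the dictionaries in the example above.
LEFT_MONODIX = """<dictionary><pardefs>
  <pardef n="ackord__n" c="autoconcord:nt,sp"/>
</pardefs></dictionary>"""

RIGHT_MONODIX = """<dictionary><pardefs>
  <pardef n="abe__n" c="autoconcord:ut,sgpl"/>
</pardefs></dictionary>"""

BIDIX = """<dictionary><pardefs>
  <pardef n="_nt_ut" c="autoconcord:nt-ut"/>
  <pardef n="_sp_sgpl" c="autoconcord:sp-sgpl"/>
</pardefs></dictionary>"""


def autoconcord_comments(dix_xml):
    """Map each paradigm name to the text after 'autoconcord:' in its c attribute."""
    comments = {}
    for pardef in ET.fromstring(dix_xml).iter("pardef"):
        comment = pardef.get("c", "")
        if comment.startswith(PREFIX):
            comments[pardef.get("n")] = comment[len(PREFIX):]
    return comments


def pair_symbols(left_symbols, right_symbols):
    """Pair the symbols position by position: first with first, second with second."""
    return [f"{left}-{right}" for left, right in zip(left_symbols, right_symbols)]


def paradigms_for(left_par, right_par):
    """Return the bidix paradigm names to attach to an entry, or None if a pair is unknown."""
    left = autoconcord_comments(LEFT_MONODIX)[left_par].split(",")
    right = autoconcord_comments(RIGHT_MONODIX)[right_par].split(",")
    # Invert the bidix mapping: pair key ("nt-ut") -> paradigm name ("_nt_ut").
    by_pair = {pair: name for name, pair in autoconcord_comments(BIDIX).items()}
    names = [by_pair.get(pair) for pair in pair_symbols(left, right)]
    # Assumption: if any pair has no bidix paradigm, leave the entry alone.
    return None if None in names else names


print(paradigms_for("ackord__n", "abe__n"))  # ['_nt_ut', '_sp_sgpl']
```

The two names returned are the ones that end up as <par n="..."/> elements in the avbrott/afbrydelse entry.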
Variations
Some autoconcord paradigms are not really useful. For example, sp-sp and sgpl-sgpl are trivial, since both sides already carry the same symbol. You can avoid insertion of these paradigms by appending /omit to the autoconcord comment of these paradigms in the bidix:
<pardef n="_sgpl_sgpl" c="autoconcord:sgpl-sgpl/omit">
  <e> <i></i></e>
</pardef>
<pardef n="_sp_sp" c="autoconcord:sp-sp/omit">
  <e> <i></i></e>
</pardef>
If you want to 'inline' a paradigm, that is, have the paradigm's symbols expanded directly in the entry, add /expand to the autoconcord comment:
<pardef n="_nt_ut" c="autoconcord:nt-ut/expand">
  <e> <p><l><s n="nt"/></l><r><s n="ut"/></r></p></e>
</pardef>
Then the corrected bidix entry will be:
<e>
<l>avbrott</l><r>afbrydelse</r>
<par n="_nt_ut"/><par n="_sp_sgpl"/></e>
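To make the comment syntax for these variations concrete, here is a small Python sketch that splits a bidix autoconcord comment into its left symbol, right symbol and optional flag. The function names and the action labels ("skip", "inline", "reference", "ignore") are made up for illustration; how apertium-dixtools actually parses and applies the flags is not shown here.

```python
PREFIX = "autoconcord:"


def parse_bidix_comment(comment):
    """Split 'autoconcord:nt-ut/expand' into ('nt', 'ut', 'expand'); the flag may be None."""
    if not comment.startswith(PREFIX):
        return None
    body, _, flag = comment[len(PREFIX):].partition("/")
    left, dash, right = body.partition("-")
    if not dash:
        return None  # not a left-right pair
    return left, right, flag or None


def action_for(comment):
    """Name what happens to an entry that matches this paradigm (labels are hypothetical)."""
    parsed = parse_bidix_comment(comment)
    if parsed is None:
        return "ignore"
    flag = parsed[2]
    if flag == "omit":
        return "skip"       # trivial pair: insert nothing
    if flag == "expand":
        return "inline"     # copy the paradigm's symbols into the entry
    return "reference"      # insert <par n="..."/> as in the default case


for c in ("autoconcord:sp-sp/omit", "autoconcord:nt-ut/expand", "autoconcord:sp-sgpl"):
    print(c, "->", action_for(c))
```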
Preparations
Invocation
Usage: apertium-dixtools autoconcord [-prefix symbol(s)] [-replace symbols] [-leftMon mon1.dix] [-rightMon mon1.dix] bidix.dix [output.dix]
       autoconcord -prepare [-leftMon mon1.dix] [-rightMon mon1.dix] bidix.dix
Automatically makes symbols (gender, number, ...) in the bidix agree with the monodices
in the cases where the concordance beyound doubt can be resolved automatically.
  -leftMon and -rightMon specify the monodices file names. If not specified they will be guessed according to default naming schemes
  -prefix   Only concord entries starting with this list of comma-separated symbols. Default: -prefix n
  -replace  Replace (remove) these symbols during processing. Default: m,f,mf,ut,nt,un
  -prepare  attempts to detect and insert autoconcord data into the monodices,
Examples
$ apertium-dixtools autoconcord apertium-sv-da.sv-da.dix
$ apertium-dixtools autoconcord -prefix n -replace ut,nt,un apertium-sv-da.sv-da.dix apertium-sv-da.sv-da.dix.new
$ apertium-dixtools autoconcord -prepare -prefix n -replace m,f,mf,ut,nt,NUMBER:sgpl{sg+pl},NUMBER:sp apertium-sv-da.sv-da.dix
There are also a number of generic options