Difference between revisions of "Postgenerator"

From Apertium

Revision as of 15:46, 24 August 2019

Sometimes you want to merge two tokens in the output, for example for contractions such as de + el = del.

You can do this using the postgenerator.

First, make sure you add the postgenerator wake-up symbol (<a/>, which appears as ~ in the generated output) to your monolingual dictionary, e.g. apertium-aaa.aaa.dix.

apertium-aaa.aaa.dix:

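   <pardef n="/de__pr">
     <!-- LR: analysis only, plain surface form -->
     <e r="LR"><p><l>de</l><r>de<s n="pr"/></r></p></e>
     <!-- RL: generation only, surface form prefixed with the wake-up symbol -->
     <e r="RL"><p><l><a/>de</l><r>de<s n="pr"/></r></p></e>
   </pardef>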

...

   <e lm="de"><i></i><par n="/de__pr"/></e>

...

Running lt-expand apertium-aaa.aaa.dix should give entries like:

de:>:de<pr>
~de:<:de<pr>

Next, create the postgenerator dictionary, apertium-aaa.post-aaa.dix:


<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
  <alphabet/>
  <sdefs>
    <sdef n="test"/>
  </sdefs>
  <section id="main" type="standard">

    <e><p><l><a/>de<b/>el</l><r>del</r></p></e>
  </section>
</dictionary>
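
Further contractions can be added as extra entries in the same section. As a hypothetical illustration (a + el = al is not part of the example on this page), such an entry might look like the sketch below. It assumes that the generation-only entry for a in the monolingual dictionary also starts with <a/>, in the same way as the de entry does.

```xml
<!-- Hypothetical extra entry for the main section of apertium-aaa.post-aaa.dix. -->
<!-- Assumes the generator emits "~a" for the preposition "a". -->
<e><p><l><a/>a<b/>el</l><r>al</r></p></e>
```

After adding an entry, the postgenerator dictionary needs to be recompiled before the change takes effect.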

You can compile it with lt-comp, in the lr direction:

$ lt-comp lr apertium-aaa.post-aaa.dix aaa.autopgen.bin
main@standard 7 6

And use it like:

$ echo "~de el" | lt-proc -p aaa.autopgen.bin 
del

In your modes file, run the postgenerator directly after the generator:

...
      <program name="lt-proc $1">
        <file name="aaa-bbb.autogen.bin"/>
      </program>
      <program name="lt-proc -p">
        <file name="aaa-bbb.autopgen.bin"/>
      </program>
...