Post-generator
Many languages use a post-generator FST to fix minor orthographical issues. This FST is in lttoolbox format and is run by lt-proc with the -p or --post-generation switch. One example of such an issue is the "a" vs "an" distinction in English: the English generator outputs ~a, and the post-generation FST changes that to a or an depending on the following word.
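To make the effect concrete, here is a rough Python sketch that mimics what the post-generator does to ~a. It is only an illustration of the behaviour described above, not how lt-proc is implemented: the function name post_generate and the vowel list are assumptions made for this example, and like the FST it looks at spelling only, not pronunciation.

```python
import re

# Assumption for this sketch: a word "starts with a vowel" if its first
# letter is one of these. Spelling only, so "~a hour" would become "a hour".
VOWELS = "aeiou"

# "~a", some whitespace, then a lookahead at the first character of the
# next word (captured without being consumed).
_PATTERN = re.compile(r"~a(\s+)(?=(\w))")


def post_generate(text: str) -> str:
    """Replace each "~a" with "a" or "an" depending on the next word."""

    def choose(match):
        article = "an" if match.group(2).lower() in VOWELS else "a"
        # Keep the original whitespace between the article and the word.
        return article + match.group(1)

    return _PATTERN.sub(choose, text)


print(post_generate("I saw ~a owl and ~a cat"))
# prints: I saw an owl and a cat
```

Text without the ~ mark passes through unchanged, which matches the idea that the post-generator only acts where the generator has flagged a word.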
The source dictionary is typically named something like apertium-cat.post-cat.dix, while the compiled file gets a name like spa-cat.autopgen.bin.
Here's a minimal example for turning ~a into an before vowels:
<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
  <alphabet/>
  <sdefs>
    <sdef n="n" c="Noun"/>
  </sdefs>
  <pardefs>
    <pardef n="vocals">
      <e> <i>a</i> </e>
      <e> <i>e</i> </e>
      <e> <i>i</i> </e>
      <e> <i>o</i> </e>
      <e> <i>u</i> </e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <e>
      <p>
        <l><a/>a<b/></l>
        <r>an<b/></r>
      </p>
      <par n="vocals"/>
    </e>
  </section>
</dictionary>
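To see which strings this dictionary actually covers, the following Python sketch expands the entry into literal input/output pairs. It is a reading aid, not part of lttoolbox: the helper names flatten and expand are made up for this example, it assumes that <a/> stands for the ~ mark and <b/> for a single blank, and it only handles entries shaped like the one above (one <p> pair followed by one <par>).

```python
import xml.etree.ElementTree as ET

# The dictionary from the example above, trimmed to the parts the sketch reads.
DIX = """<dictionary>
  <pardefs>
    <pardef n="vocals">
      <e><i>a</i></e>
      <e><i>e</i></e>
      <e><i>i</i></e>
      <e><i>o</i></e>
      <e><i>u</i></e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <e>
      <p>
        <l><a/>a<b/></l>
        <r>an<b/></r>
      </p>
      <par n="vocals"/>
    </e>
  </section>
</dictionary>"""

# Assumed renderings for this sketch: <a/> is the "~" mark, <b/> is a blank.
MARKS = {"a": "~", "b": " "}


def flatten(node):
    """Concatenate a node's text, rendering <a/> and <b/> children."""
    parts = [node.text or ""]
    for child in node:
        parts.append(MARKS.get(child.tag, ""))
        parts.append(child.tail or "")
    return "".join(parts)


def expand(dix_source):
    """List (input, output) pairs for entries of the form <p>...</p><par/>."""
    root = ET.fromstring(dix_source)
    # Map each paradigm name to the list of strings it contributes.
    pardefs = {
        pardef.get("n"): [flatten(e.find("i")) for e in pardef.findall("e")]
        for pardef in root.iter("pardef")
    }
    pairs = []
    for entry in root.find("section").findall("e"):
        left = flatten(entry.find("p/l"))
        right = flatten(entry.find("p/r"))
        # The paradigm appends the same string to both sides.
        for suffix in pardefs[entry.find("par").get("n")]:
            pairs.append((left + suffix, right + suffix))
    return pairs


for source, target in expand(DIX):
    print(repr(source), "->", repr(target))
```

The first pair printed is '~a a' -> 'an a': the entry matches ~a, a blank, and a vowel, and rewrites only the article while passing the vowel through.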