Difference between revisions of "Apertium-pretransfer"

From Apertium

Revision as of 08:25, 31 January 2011

apertium-pretransfer (installed as part of the apertium package) preprocesses multiword units before bidix lookup. Its input is expected to be disambiguated and to contain no surface forms (only analyses).

Compound multiwords (e.g. a contraction in Romance languages, marked with <j/> in the monodix, or a compound nominal in North Germanic languages) are split in two at the + sign:

$ echo '^de<pr>+el<det><def><m><sg>$' | apertium-pretransfer 
^de<pr>$ ^el<det><def><m><sg>$
$ echo '^arbeidsmiljø<n><nt><sg><ind><ep-Ø>+lov<n><m><sg><def>$' | apertium-pretransfer 
^arbeidsmiljø<n><nt><sg><ind><ep-Ø>$ ^lov<n><m><sg><def>$
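
The splitting step can be approximated in a few lines of Python. This is only a sketch that reproduces the two examples above, not the actual apertium-pretransfer implementation: the function name split_compounds is hypothetical, and the sketch assumes the input contains no escaped characters and nothing but plain ^...$ lexical units.

```python
import re

# One lexical unit in Apertium stream format: ^ ... $
# Assumption: no escaped ^, $ or + characters inside the unit.
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def split_compounds(stream: str) -> str:
    """Split every ^a+b$ unit into ^a$ ^b$ (hypothetical helper)."""

    def split_unit(match: re.Match) -> str:
        # Each '+'-separated analysis becomes its own lexical unit.
        parts = match.group(1).split("+")
        return " ".join(f"^{part}$" for part in parts)

    return LEXICAL_UNIT.sub(split_unit, stream)


if __name__ == "__main__":
    print(split_compounds("^de<pr>+el<det><def><m><sg>$"))
```

Units that contain no + sign pass through this sketch unchanged.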

Multiwords with inner inflection (marked with <g/> in the monodix) have the uninflected part moved from behind the tags onto the lemma:

$ echo '^poner<vblex><inf># a prueba$' | apertium-pretransfer 
^poner# a prueba<vblex><inf>$
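
The reordering can likewise be sketched with a regular expression. Again, this is an illustration under assumptions rather than the real implementation: move_invariable_part is a hypothetical name, and the pattern assumes a single analysis per unit, no escaped characters, and an invariable part that starts with # and follows all the tags.

```python
import re

# Groups: (1) lemma, (2) the run of <tag>s, (3) the invariable part from '#'.
# Assumption: one analysis per unit and no escaped characters.
INNER_INFLECTION = re.compile(r"\^([^<$#]+)((?:<[^>]+>)+)(#[^<$]*)\$")


def move_invariable_part(stream: str) -> str:
    """Rewrite ^lemma<tags># rest$ as ^lemma# rest<tags>$ (hypothetical helper)."""
    # Swap groups 2 and 3 so the invariable part sits directly on the lemma.
    return INNER_INFLECTION.sub(r"^\1\3\2$", stream)


if __name__ == "__main__":
    print(move_invariable_part("^poner<vblex><inf># a prueba$"))
```

Units without a # part do not match the pattern and are left as they are.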