Difference between revisions of "Multitrans"
Revision as of 12:54, 29 April 2015
multitrans is a program found in apertium-lex-tools, used as a helper when training (see Learning rules from parallel and non-parallel corpora).
Modes
-b
This will output the source along with all target translations, like lt-proc -b.
Running
multitrans sl-tl.autobil.bin -b
is equivalent to running lt-proc -b sl-tl.autobil.bin,
provided the input consists only of correctly formatted lexical units (lt-proc -b fails on some malformed input that multitrans ignores).
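To make the shape of this output concrete, here is a minimal Python sketch that splits one lt-proc -b style unit into its source side and its target translations. The helper name split_bilingual_unit is hypothetical, the example string is hand-written for illustration rather than captured from a real run, and the sketch assumes that no "/" inside the unit is escaped.

```python
def split_bilingual_unit(unit):
    """Split '^source/target1/target2$' into (source, [targets]).

    Assumption: the unit contains no escaped '/' characters.
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError(f"not a lexical unit: {unit!r}")
    # Drop the '^' and '$' delimiters, then split on the '/' separators.
    source, *targets = unit[1:-1].split("/")
    return source, targets


# Hand-written illustrative unit, not real program output.
example = (
    "^obsternasig<adj><pst><sg><ind>"
    "/obstinate<adj><pst><sg><ind>"
    "/obdurate<adj><pst><sg><ind>$"
)
source, targets = split_bilingual_unit(example)
print(source)
print(targets)
```

A script that post-processes -b output could apply a helper like this to each unit it finds in a line.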
-p
This will output the source side only. Used alone it behaves like cat, but combined with -t it trims the tags down to those the bidix has.
So if bidix has an entry for kake<n><f>, you'll get
$ echo '^kake<n><f><sg><def>$' |multitrans nno-nob.autobil.bin -p -t
^kake<n><f><*>$
-m
This will output one line per translation, each pairing the source with a single target, e.g.
$ echo '^obsternasig<adj><pst><sg><ind>$' |multitrans nor-eng.autobil.bin -m
.[][0 0].[] ^obsternasig<adj><pst><sg><ind>/obstinate<adj><pst><sg><ind>$
.[][0 1].[] ^obsternasig<adj><pst><sg><ind>/obdurate<adj><pst><sg><ind>$
.[][0 2].[] ^obsternasig<adj><pst><sg><ind>/stubborn<adj><pst><sg><ind>$
.[][0 3].[] ^obsternasig<adj><pst><sg><ind>/refractory<adj><pst><sg><ind>$
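Lines in this format are easy to take apart in a training script. The following Python sketch parses one -m output line with a regular expression. The function name parse_m_line and the field names are hypothetical, and the meaning of the two bracketed numbers is not stated here: in the sample above the first stays at 0 while the second increases with each translation, so the sketch simply returns both as integers without interpreting them.

```python
import re

# Pattern inferred from the sample output above; other inputs may differ.
M_LINE = re.compile(
    r"^\.\[\]\[(\d+) (\d+)\]\.\[\]\s+"  # prefix such as '.[][0 2].[] '
    r"\^([^/]*)/(.*)\$$"                # '^source/target$'
)


def parse_m_line(line):
    """Return (first_index, second_index, source, target) or None."""
    match = M_LINE.match(line.strip())
    if match is None:
        return None
    first, second, source, target = match.groups()
    return int(first), int(second), source, target


sample = ".[][0 2].[] ^obsternasig<adj><pst><sg><ind>/stubborn<adj><pst><sg><ind>$"
parsed = parse_m_line(sample)
print(parsed)
```

Returning None for lines that do not match keeps the caller free to skip or log unexpected input.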
Options
-t
Trims off tags that don't appear in bidix, e.g. if bidix has an entry for kake<n><f>:
$ echo '^kake<n><f><sg><def>$' |multitrans nno-nob.autobil.bin -p -t
^kake<n><f><*>$
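The trimming shown above can be imitated in a few lines of Python. This is only a sketch of the observable behaviour in the kake example, not the way multitrans implements it: the function trim_tags and the set-of-tuples stand-in for the bidix are hypothetical, and choosing the longest matching tag prefix, leaving unknown units unchanged, and adding no <*> when every tag matches are all assumptions.

```python
import re

# A lexical unit with a lemma followed by zero or more <tag> groups.
LU_RE = re.compile(r"^\^([^<$/]+)((?:<[^>]+>)*)\$$")


def trim_tags(lexical_unit, bidix_entries):
    """Keep the longest tag prefix found in bidix_entries, then add <*>.

    bidix_entries is a hypothetical stand-in for the compiled bidix:
    a set of (lemma, tuple_of_tags) pairs.
    """
    match = LU_RE.match(lexical_unit)
    if match is None:
        return lexical_unit  # assumption: malformed input passes through
    lemma, tag_string = match.groups()
    tags = re.findall(r"<([^>]+)>", tag_string)
    # Try the longest prefix first, then progressively shorter ones.
    for n in range(len(tags), -1, -1):
        if (lemma, tuple(tags[:n])) in bidix_entries:
            kept = "".join(f"<{t}>" for t in tags[:n])
            rest = "<*>" if n < len(tags) else ""
            return f"^{lemma}{kept}{rest}$"
    return lexical_unit  # assumption: unknown units are left unchanged


bidix = {("kake", ("n", "f"))}
trimmed = trim_tags("^kake<n><f><sg><def>$", bidix)
print(trimmed)
```

With the single entry for kake<n><f>, the call reproduces the ^kake<n><f><*>$ line from the example.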
-f
Not yet documented here: what does this option do?
-n
Numbers the lines. It does not seem to make a difference in -m mode.