Difference between revisions of "Multitrans"
Revision as of 10:21, 9 May 2021
multitrans is a program found in apertium-lex-tools, used as a helper when training (see Learning rules from parallel and non-parallel corpora).
Modes
-b
This will output the source along with all target translations, like lt-proc -b.
Running just
multitrans -b sl-tl.autobil.bin
is equivalent to running lt-proc -b sl-tl.autobil.bin, provided the input consists only of correctly formatted lexical units (lt-proc -b fails on some malformed input that multitrans ignores).
-p
This will output the source side only. Used alone it behaves like cat; combined with -t, it trims the tags down to those that bidix has.
So if bidix has an entry for kake<n><f>, you'll get
$ echo '^kake<n><f><sg><def>$' | multitrans -p -t nno-nob.autobil.bin
^kake<n><f><*>$
-m
This will output one entry on each line with a pair of translations, e.g.
$ echo '^obsternasig<adj><pst><sg><ind>$' | multitrans -m nor-eng.autobil.bin
.[][0 0].[]     ^obsternasig<adj><pst><sg><ind>/obstinate<adj><pst><sg><ind>$
.[][0 1].[]     ^obsternasig<adj><pst><sg><ind>/obdurate<adj><pst><sg><ind>$
.[][0 2].[]     ^obsternasig<adj><pst><sg><ind>/stubborn<adj><pst><sg><ind>$
.[][0 3].[]     ^obsternasig<adj><pst><sg><ind>/refractory<adj><pst><sg><ind>$
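For downstream scripts it can help to split each -m line into its parts. The following is a minimal Python sketch, not part of multitrans: the function name parse_m_line and the regular expression are hypothetical and inferred only from the sample output above. The two integers in brackets are returned as-is, since their exact meaning is not documented here, and lemmas containing an escaped slash are not handled.

```python
import re

# Pattern inferred from the sample -m output only (an assumption, not taken
# from the multitrans source): ".[][A B].[]", whitespace, then "^source/target$".
LINE_RE = re.compile(r"^\.\[\]\[(\d+) (\d+)\]\.\[\]\s+\^([^/]+)/([^/]+)\$$")


def parse_m_line(line):
    """Split one line of -m output into (first_int, second_int, source, target)."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    first, second, source, target = match.groups()
    return int(first), int(second), source, target


if __name__ == "__main__":
    sample = ".[][0 1].[]     ^obsternasig<adj><pst><sg><ind>/obdurate<adj><pst><sg><ind>$"
    print(parse_m_line(sample))
```

Each returned tuple carries one source/target pair, which matches the one-pair-per-line layout of the -m mode.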
Options
-t
Trims off tags that don't appear in bidix, e.g. if bidix has an entry for kake<n><f>:
$ echo '^kake<n><f><sg><def>$' | multitrans -p -t nno-nob.autobil.bin
^kake<n><f><*>$
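The effect of trimming on a single lexical unit can be imitated in a few lines of Python. This is only an illustration of the input/output relationship shown above, not how multitrans implements it: the function trim_tags is hypothetical, it takes the matching bidix entry as a plain string instead of looking it up in a compiled dictionary, and its behaviour for units with no matching entry or with no leftover tags is an assumption.

```python
def trim_tags(lexical_unit, bidix_entry):
    """Keep the part of the unit covered by bidix_entry; replace the rest with <*>."""
    body = lexical_unit.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("expected a lexical unit of the form ^...$")
    body = body[1:-1]
    if not body.startswith(bidix_entry):
        # Assumption: units without a matching entry are returned unchanged.
        return lexical_unit
    remainder = body[len(bidix_entry):]
    # Assumption: <*> is only added when there are leftover tags to stand in for.
    suffix = "<*>" if remainder else ""
    return "^" + bidix_entry + suffix + "$"


if __name__ == "__main__":
    print(trim_tags("^kake<n><f><sg><def>$", "kake<n><f>"))
```

With the kake example, the tags after the bidix entry are collapsed into a single <*>.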
Can be used with -m or -b as well:
$ echo '^obsternasig<adj><pst><sg><ind>$' | multitrans -m -t nor-eng.autobil.bin
.[][0 0].[]     ^obsternasig<adj><*>/obstinate<adj><*>$
.[][0 1].[]     ^obsternasig<adj><*>/obdurate<adj><*>$
.[][0 2].[]     ^obsternasig<adj><*>/stubborn<adj><*>$
.[][0 3].[]     ^obsternasig<adj><*>/refractory<adj><*>$
$ echo '^obsternasig<adj><pst><sg><ind>$' | multitrans -b -t nor-eng.autobil.bin
^obsternasig<adj><*>/obstinate<adj><*>/obdurate<adj><*>/stubborn<adj><*>/refractory<adj><*>$
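The -b output packs the source and every translation into one lexical unit separated by slashes. A small Python sketch for unpacking it, assuming no lemma contains an escaped slash; the helper name split_biltrans is hypothetical and not part of apertium-lex-tools.

```python
def split_biltrans(lexical_unit):
    """Return (source, [targets]) for one ^source/target1/target2$ unit."""
    body = lexical_unit.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("expected a lexical unit of the form ^...$")
    # The first field is the source side; everything after it is a translation.
    source, *targets = body[1:-1].split("/")
    return source, targets


if __name__ == "__main__":
    unit = ("^obsternasig<adj><*>/obstinate<adj><*>/obdurate<adj><*>"
            "/stubborn<adj><*>/refractory<adj><*>$")
    source, targets = split_biltrans(unit)
    print(source)
    for target in targets:
        print("  ", target)
```

Pairing the source with each element of the target list gives the same pairs that the -m mode prints one per line.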
-f
Not yet documented: what does this option do?
-n
Numbers the lines. It doesn't seem to make a difference in -m mode.

