Difference between revisions of "Testvoc"

From Apertium

Revision as of 14:27, 7 December 2009

A testvoc is literally a test of vocabulary. At its most basic, it expands a source-language (sl) dictionary and runs every possible analysed lexical form through all the translation stages, checking that each input produces a sensible target-language (tl) translation free of # and @ symbols.

Example script

This is an example inconsistency.sh script from apertium-br-fr that expands the dictionary of Breton and passes it through the translator.

TMPDIR=/tmp

lt-expand ../apertium-br-fr.br.dix | grep -v '<prn><enc>' | grep -e ':<:' -e '\w:\w' |\
 sed 's/:<:/%/g' | sed 's/:/%/g' | cut -f2 -d'%' |  sed 's/^/^/g' | sed 's/$/$ ^.<sent>$/g' |\
 tee $TMPDIR/tmp_testvoc1.txt |\
        apertium-pretransfer|\
        apertium-transfer ../apertium-br-fr.br-fr.t1x  ../br-fr.t1x.bin  ../br-fr.autobil.bin |\
        apertium-interchunk ../apertium-br-fr.br-fr.t2x  ../br-fr.t2x.bin |\
        apertium-postchunk ../apertium-br-fr.br-fr.t3x  ../br-fr.t3x.bin  | tee $TMPDIR/tmp_testvoc2.txt |\
        lt-proc -d ../br-fr.autogen.bin > $TMPDIR/tmp_testvoc3.txt
paste -d _ $TMPDIR/tmp_testvoc1.txt $TMPDIR/tmp_testvoc2.txt $TMPDIR/tmp_testvoc3.txt | sed 's/\^.<sent>\$//g' | sed 's/_/   --------->  /g'
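The script above prints every expanded form, whether or not it translated cleanly. To list only the failures, the output can be filtered for lines whose final (generated) column still carries a # or @ symbol. The following is a minimal sketch, not part of the original script: the function name testvoc_errors is hypothetical, and it assumes the three-column "--------->" layout produced by the last line of the script. Only the last column is checked, on the assumption that # may legitimately appear inside lexical forms in the earlier columns.

```shell
# Hypothetical helper (not part of inconsistency.sh): read testvoc output on
# stdin and print only the lines whose last column contains '@' or '#'.
testvoc_errors() {
    # Columns are separated by the "--------->" arrow with optional padding;
    # $NF is the last column, i.e. the generated target-language form.
    awk -F' *---------> *' '$NF ~ /[@#]/'
}
```

Once the function is defined in the current shell, the script's output can be piped through it, for example: sh inconsistency.sh | testvoc_errors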