Difference between revisions of "Modes"

From Apertium

Revision as of 17:44, 19 January 2008

Pipelines can be used in a few ways in Apertium; one of them is modes files. A modes file (typically called modes.xml) is an XML file that specifies which programs should be run and in what order. Normally each linguistic package has one such file, which defines the various ways the package's data can be used to perform translations.


Statistics mode

To collect statistics about translations made with Apertium, we have hacked the main translation mode: the pipeline is split in two, the output of the first part is saved to a temporary file, and the pipeline then resumes with that file as stdin. In the pipeline shown below, the split comes right after morphological analysis (lt-proc), so the file is written before apertium-tagger disambiguates it.
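The split-and-resume pattern can be sketched with stand-in commands. This is only an illustration: `tr` and `sed` are placeholders for the real pipeline stages (they are not Apertium programs), the function names are hypothetical, and the log directory is a throwaway one created with `mktemp`.

```shell
#!/bin/sh
# Stand-in demonstration of a pipeline that is paused at a temp file
# and then resumed. tr and sed are placeholders, not Apertium programs.

LOGSDIR=$(mktemp -d)/      # hypothetical log directory (note the trailing slash)
SEC=$(date +%s)            # seconds since the epoch, used as the log ID

# First half: run the early stage and save its output instead of piping it on.
first_half() {
    tr '[:upper:]' '[:lower:]' > "$LOGSDIR$SEC.tmp"
}

# Second half: resume the pipeline with the saved file as stdin.
second_half() {
    sed 's/ /_/g' < "$LOGSDIR$SEC.tmp"
}

# The ';' between the halves is what "breaks" the pipeline.
run_split_pipeline() {
    first_half; second_half
}

result=$(echo "Hello Split Pipeline" | run_split_pipeline)
echo "$result"
echo "intermediate output kept in $LOGSDIR$SEC.tmp"
```

The first line printed is `hello_split_pipeline`, while the temp file keeps the output of the first stage only (`hello split pipeline`). That intermediate file is what gets analysed later.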

As an example, you can see the /broken/ (that is, split) pipeline for ca-es, installed as ca-es-estadistiques.mode:

/usr/local/bin/lt-proc /usr/local/share/apertium/apertium-es-ca/ca-es.automorf.bin > $LOGSDIR$SEC.tmp;/usr/local/bin/apertium-tagger -g /usr/local/share/apertium/apertium-es-ca/ca-es.prob < $LOGSDIR$SEC.tmp |/usr/local/bin/apertium-pretransfer|/usr/local/bin/apertium-transfer /usr/local/share/apertium/apertium-es-ca/apertium-es-ca.trules-ca-es.xml  /usr/local/share/apertium/apertium-es-ca/trules-ca-es.bin  /usr/local/share/apertium/apertium-es-ca/ca-es.autobil.bin |/usr/local/bin/lt-proc $1 /usr/local/share/apertium/apertium-es-ca/ca-es.autogen.bin |/usr/local/bin/lt-proc -p /usr/local/share/apertium/apertium-es-ca/ca-es.autopgen.bin

Apertium can then be called with this mode as follows:

LOGSDIR=~/logs/apertium/; SEC=`date +%s`; export LOGSDIR SEC
echo "Ara Apertium permet extraure estadístiques" | apertium ca-es-estadistiques

In that example, $LOGSDIR is the directory where the logs are saved, and $SEC is a unique ID for that log (here, the current time in seconds since the Unix epoch).

Once the translation is done, the saved log can be processed to extract statistics.
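As a rough sketch of such processing, the script below counts lexical units in a saved log. It assumes the log is in the Apertium stream format (`^surface/analysis1/analysis2$`, with unknown words marked by `*`). The function names are hypothetical, and the sample log is made up for illustration rather than taken from a real analyser run.

```shell
#!/bin/sh
# Sketch: count lexical units in a log saved by the statistics mode.
# Assumes Apertium stream format; escaped characters (\/ or \$) are not handled.

# Print every lexical unit (^...$) of the file on its own line.
lexical_units() {
    grep -o '\^[^$]*\$' "$1"
}

# Total number of lexical units.
count_total()     { lexical_units "$1" | grep -c '.'; }
# Unknown words carry an asterisk right after the first slash.
count_unknown()   { lexical_units "$1" | grep -c '^\^[^/]*/\*'; }
# A unit with two or more slashes carries more than one analysis.
count_ambiguous() { lexical_units "$1" | grep -c '/.*/'; }

# Made-up sample log; the analyses are illustrative, not real analyser output.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
^Ara/ara<adv>/ara<cnjadv>$ ^Apertium/*Apertium$ ^permet/permetre<vblex><pri><p3><sg>$ ^extraure/extreure<vblex><inf>$ ^dades/dada<n><f><pl>/dar<vblex><pp><f><pl>$^./.<sent>$
EOF

echo "lexical units: $(count_total "$LOG")"
echo "unknown:       $(count_unknown "$LOG")"
echo "ambiguous:     $(count_ambiguous "$LOG")"
```

On the sample above this reports 6 lexical units, 1 unknown word and 2 ambiguous units. For a real run, `$LOG` would point at `$LOGSDIR$SEC.tmp` instead of the made-up file.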


See also