Apertium-quality/Application Documentation

Latest revision as of 11:02, 30 August 2011

Usage Information

aq-ambtest — Ambiguity Testing

Help output

usage: aq-ambtest [-h] [-X [STATFILE]] dictionary

Get average ambiguity.

positional arguments:
  dictionary            DIX file

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)

Usage

aq-ambtest apertium-mt-he.mt.dix -X
aq-ambtest apertium-mt-he.he.dix -X
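
The two commands above differ only in the language code, so they can be generated from the pair name. The following is a minimal sketch, assuming the dictionaries follow the `apertium-<pair>.<lang>.dix` naming seen in the examples; `PAIR`, `DRY_RUN` and the helper functions are names invented for this sketch and are not part of apertium-quality. By default it only prints the commands.

```sh
#!/bin/sh
# Sketch: run aq-ambtest on both monolingual dictionaries of a language pair.
# Assumption: files are named apertium-<pair>.<lang>.dix, as in the usage above.
# PAIR and DRY_RUN are hypothetical variables of this sketch, not tool options.
PAIR="${PAIR:-mt-he}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to really invoke the tool

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

ambtest_pair() {
    # "${PAIR%%-*}" is the part before the hyphen, "${PAIR#*-}" the part after.
    for lang in "${PAIR%%-*}" "${PAIR#*-}"; do
        # -X is given last and without a value, as in the usage above,
        # so the default statistics file (quality-stats.xml) is used.
        run aq-ambtest "apertium-$PAIR.$lang.dix" -X
    done
}

ambtest_pair
```

Keeping `-X` at the end of the command line mirrors the documented usage; since `STATFILE` is optional, placing `-X` directly before the dictionary could plausibly cause the dictionary to be read as the statistics file name.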

aq-wikicrp — Wikipedia Corpus Extractor

Help output

usage: aq-wikicrp [-h] [-c COUNT] [-C CORES] [-t TOKENISER] [-q QUEUE] [-x]
                  wikidump outfile

Extract a usable corpus from a Wikipedia dump.

positional arguments:
  wikidump              Wikipedia XML dump
  outfile               Output filename

optional arguments:
  -h, --help            show this help message and exit
  -c COUNT, --count COUNT
                        Maximum sentences to store in corpus output (default:
                        unlimited)
  -C CORES, --cores CORES
                        Limit how many cores to use for generation
  -t TOKENISER, --tokeniser TOKENISER
                        Tokeniser to use
  -q QUEUE, --queue QUEUE
                        Set queue size (for advanced users)
  -x, --xml             Output corpora in XML format

Usage

wget http://dumps.wikimedia.org/mtwiki/latest/mtwiki-latest-pages-articles.xml.bz2 && bunzip2 mtwiki-latest-pages-articles.xml.bz2
aq-wikicrp mtwiki-latest-pages-articles.xml mt.wikipedia.crp.txt
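
For a quick experiment it may be useful to cap the corpus size and the number of cores with the `-c` and `-C` options listed in the help output. The sketch below only assembles such a command; the values 10000 and 2 are arbitrary examples, and `MAX_SENTENCES`, `CORES`, `DRY_RUN` and the helper functions are hypothetical names used here for illustration.

```sh
#!/bin/sh
# Sketch: extract a size-limited corpus from an already unpacked dump.
# The limits below are example values, not recommendations.
DUMP="${DUMP:-mtwiki-latest-pages-articles.xml}"
OUT="${OUT:-mt.wikipedia.crp.txt}"
MAX_SENTENCES="${MAX_SENTENCES:-10000}"   # passed to -c / --count
CORES="${CORES:-2}"                       # passed to -C / --cores
DRY_RUN="${DRY_RUN:-1}"                   # set DRY_RUN=0 to really run

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

wikicrp_limited() {
    run aq-wikicrp -c "$MAX_SENTENCES" -C "$CORES" "$DUMP" "$OUT"
}

wikicrp_limited
```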

aq-covtest — Coverage Testing

Help output

usage: aq-covtest [-h] [-X [STATFILE]] [-H] corpus dictionary

Test coverage.

positional arguments:
  corpus                Corpus text file
  dictionary            Binary dictionary (.bin, .fst, etc)

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -H, --hfst            HFST mode

Usage

aq-covtest mt.news.crp.txt mt-he.automorf.bin -X

aq-htmlgen — HTML Generation for Statistics

Help output

usage: aq-htmlgen [-h] [-t [TITLE]] statistics outdir

Generate webpage and related files.

positional arguments:
  statistics            Statistics file
  outdir                Output directory

optional arguments:
  -h, --help            show this help message and exit
  -t [TITLE], --title [TITLE]
                        Directory of dictionary (Default: current directory)

Usage

aq-htmlgen quality-stats.xml output
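
Since the testing tools share the same default statistics file, a typical session probably collects statistics with one or more of them and then renders the result with aq-htmlgen. The sketch below chains the example commands from this page under that assumption; whether the tools append to an existing statistics file is not stated in the help output and is assumed here. `STATS`, `OUTDIR`, `DRY_RUN` and the helper functions are hypothetical names.

```sh
#!/bin/sh
# Sketch: collect statistics with two tests, then generate the HTML report.
# Assumption: the tools accumulate their results in the same statistics file.
STATS="${STATS:-quality-stats.xml}"
OUTDIR="${OUTDIR:-output}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to really invoke the tools

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_report() {
    # Each step runs only if the previous one succeeded.
    run aq-ambtest apertium-mt-he.mt.dix -X "$STATS" &&
    run aq-covtest mt.news.crp.txt mt-he.automorf.bin -X "$STATS" &&
    run aq-htmlgen "$STATS" "$OUTDIR"
}

build_report
```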

aq-autotest — Automatic Testing using AQX files

Help output

usage: aq-autotest [-h] [-c] [-X [STATS]] [-o [OUTDIR]] aqx

Attempt all tests with default settings.

positional arguments:
  aqx                   Apertium Quality XML configuration file

optional arguments:
  -h, --help            show this help message and exit
  -c, --colour          Colours the output
  -X [STATS], --statistics [STATS]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -o [OUTDIR], --html [OUTDIR]
                        Output directory for HTML content

Usage

aq-autotest -X stats.xml -o output quality.aqx

aq-gentest — Generation Testing

Help output

usage: aq-gentest [-h] [-X [STATFILE]] [-d [DIRECTORY]] mode corpus

Test generation.

positional arguments:
  mode                  Language mode (eg, br-fr)
  corpus                Corpus text file

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -d [DIRECTORY], --dict [DIRECTORY]
                        Directory of dictionary (Default: current directory)

Usage

aq-gentest -d . mt-he mt.crp.txt -X

aq-dixtest — Dictionary tests (rule count, etc)

Help output

usage: aq-dixtest [-h] [-X [STATFILE]] [-d [DICTDIR]] langpair

Get general dictionary statistics.

positional arguments:
  langpair              Language pair (eg aa-ab)

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -d [DICTDIR], --dict [DICTDIR]
                        Directory of dictionary (Default: current directory)

Usage

aq-dixtest -d . mt-he 

aq-regtest — Regression Testing

Help output

usage: aq-regtest [-h] [-X [STATFILE]] [-d [DICTDIR]] mode wikiurl

Test for regressions directly from Apertium wiki.

positional arguments:
  mode                  Mode of operation (eg. br-fr)
  wikiurl               URL to regression tests

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -d [DICTDIR], --dict [DICTDIR]
                        Directory of dictionary (Default: current directory)

Usage

aq-regtest -d . mt-he http://wiki.apertium.org/wiki/Special:Export/Maltese_and_Hebrew/Regression_tests -X
aq-regtest -d . mt-he Regression_tests.xml -X
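
The second command above takes a local file in place of the URL. One plausible way to obtain that file is to save the Special:Export page once and test against the saved copy; this is an assumption about where `Regression_tests.xml` comes from, not something the help output states. `URL`, `LOCAL`, `DRY_RUN` and the helper functions in the sketch are hypothetical names.

```sh
#!/bin/sh
# Sketch: download the regression test export once, then test against the copy.
# Assumption: the local XML file is simply the saved Special:Export page.
URL="${URL:-http://wiki.apertium.org/wiki/Special:Export/Maltese_and_Hebrew/Regression_tests}"
LOCAL="${LOCAL:-Regression_tests.xml}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to really download and test

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

regtest_offline() {
    # wget -O writes the downloaded page to the given file name.
    run wget -O "$LOCAL" "$URL" &&
    run aq-regtest -d . mt-he "$LOCAL" -X
}

regtest_offline
```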

aq-voctest — Vocabulary Testing

Help output

usage: aq-voctest [-h] [-X [STATFILE]] [-a [ANADIX]] [-g [GENBIN]]
                  [-d [DICTDIR]] [-D [DIRECTION]] [-o [OUTPUT]]
                  langpair

Test vocabulary for generation errors.

positional arguments:
  langpair              Language pair (eg, br-fr)

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -a [ANADIX], --anadix [ANADIX]
                        Analysis .dix file (Default: apertium-aa-ab.aa-ab.dix)
  -g [GENBIN], --genbin [GENBIN]
                        Generation .bin file (Default: apertium-aa-
                        ab.autogen.bin)
  -d [DICTDIR], --dict [DICTDIR]
                        Directory of dictionary (Default: current directory)
  -D [DIRECTION], --direction [DIRECTION]
                        Dictionary direction (lr, rl)
  -o [OUTPUT], --output [OUTPUT]
                        Output file for arrows output (Default: voctest.txt)

Usage

aq-voctest mt-he -X

aq-morftest — Morph Testing (HFST, etc)

Help output

usage: aq-morftest [-h] [-c] [-X [STATFILE]] [-C] [-i] [-s] [-l] [-f] [-p]
                   [-S SECTION] [-t TEST] [-v] [--app APP] [--gen GEN]
                   [--morph MORPH]
                   test_file

Test morphological transducers for consistency. `hfst-lookup` (or Xerox'
`lookup` with argument -x) must be available on the PATH.

positional arguments:
  test_file             YAML file with test rules

optional arguments:
  -h, --help            show this help message and exit
  -c, --colour          Colours the output
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -C, --compact         Makes output more compact
  -i, --ignore-extra-analyses
                        Ignore extra analyses when there are more than
                        expected, will PASS if the expected one is found.
  -s, --surface         Surface input/analysis tests only
  -l, --lexical         Lexical input/generation tests only
  -f, --hide-fails      Suppresses passes to make finding failures easier
  -p, --hide-passes     Suppresses failures to make finding passes easier
  -S SECTION, --section SECTION
                        The section to be used for testing (default is `hfst`)
  -t TEST, --test TEST  Which test to run (Default: all). TEST = test ID, e.g.
                        'Noun - gåetie' (remember quotes if the ID contains
                        spaces)
  -v, --verbose         More verbose output.
  --app APP             Override application used for test
  --gen GEN             Override generation transducer used for test
  --morph MORPH         Override morph transducer used for test

Will run all tests in the test_file by default.

Usage

aq-morftest tgl.yaml -X
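
When debugging, it can help to run a single test from one section with coloured output, using the `-c`, `-S` and `-t` options from the help output. The sketch below shows how the test ID should be quoted when it contains spaces. The ID `Noun - example` is a made-up placeholder that must be replaced by an ID actually present in the YAML file, and `TEST_ID`, `SECTION`, `DRY_RUN` and the helper functions are hypothetical names. In dry-run mode each argument is printed on its own line so that the effect of the quoting is visible.

```sh
#!/bin/sh
# Sketch: run one named test from one section of a morftest YAML file.
# TEST_ID is a placeholder; use an ID that exists in your own test file.
TEST_FILE="${TEST_FILE:-tgl.yaml}"
SECTION="${SECTION:-hfst}"            # `hfst` is the documented default
TEST_ID="${TEST_ID:-Noun - example}"  # contains spaces, so it must stay quoted
DRY_RUN="${DRY_RUN:-1}"               # set DRY_RUN=0 to really run

# Print one argument per line when DRY_RUN=1, otherwise execute the command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$@"
    else
        "$@"
    fi
}

morftest_one() {
    run aq-morftest -c -S "$SECTION" -t "$TEST_ID" "$TEST_FILE"
}

morftest_one
```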