Apertium-quality/Application Documentation

Revision as of 11:02, 30 August 2011 by 123.243.206.102 (talk)

Usage Information

aq-ambtest — Ambiguity Testing

Help output

usage: aq-ambtest [-h] [-X [STATFILE]] dictionary

Get average ambiguity.

positional arguments:
  dictionary            DIX file

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)

Usage

aq-ambtest apertium-mt-he.mt.dix -X
aq-ambtest apertium-mt-he.he.dix -X
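
The two calls above differ only in the language code, so they can be generated from the pair name. The following is a minimal sketch, not part of apertium-quality: `ambtest_cmds` is a hypothetical helper, and the `apertium-PAIR.LANG.dix` naming is inferred from the example file names. It keeps `-X` last, as in the examples; since `-X` takes an optional STATFILE, placing it directly before the dictionary would likely cause the dictionary path to be read as the statistics file.

```shell
# Hypothetical helper: print the aq-ambtest commands for both
# monolingual dictionaries of a language pair such as "mt-he".
# The apertium-PAIR.LANG.dix naming is an assumption taken from
# the usage example; adjust it if your files are named differently.
ambtest_cmds() {
    pair=$1
    left=${pair%%-*}    # text before the first hyphen, e.g. "mt"
    right=${pair#*-}    # text after the first hyphen, e.g. "he"
    for lang in "$left" "$right"; do
        printf 'aq-ambtest apertium-%s.%s.dix -X\n' "$pair" "$lang"
    done
}

ambtest_cmds mt-he
```

Piping the output to `sh` (`ambtest_cmds mt-he | sh`) would run the commands from the language pair directory.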

aq-wikicrp — Wikipedia Corpus Extractor

Help output

usage: aq-wikicrp [-h] [-c COUNT] [-C CORES] [-t TOKENISER] [-q QUEUE] [-x]
                  wikidump outfile

Extract a usable corpus from a Wikipedia dump.

positional arguments:
  wikidump              Wikipedia XML dump
  outfile               Output filename

optional arguments:
  -h, --help            show this help message and exit
  -c COUNT, --count COUNT
                        Maximum sentences to store in corpus output (default:
                        unlimited)
  -C CORES, --cores CORES
                        Limit how many cores to use for generation
  -t TOKENISER, --tokeniser TOKENISER
                        Tokeniser to use
  -q QUEUE, --queue QUEUE
                        Set queue size (for advanced users)
  -x, --xml             Output corpora in XML format

Usage

wget http://dumps.wikimedia.org/mtwiki/latest/mtwiki-latest-pages-articles.xml.bz2 && bunzip2 mtwiki-latest-pages-articles.xml.bz2
aq-wikicrp mtwiki-latest-pages-articles.xml mt.wikipedia.crp.txt
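
The same steps can be parameterised by Wikipedia language code and capped with `-c`. This is a sketch under assumptions: `dump_url` and `extract_corpus` are hypothetical helper names, and the URL pattern is generalised from the mtwiki example, so it may not hold for every wiki.

```shell
# Hypothetical helper: dump URL for a Wikipedia language code,
# following the mtwiki URL pattern used in the example above.
dump_url() {
    printf 'http://dumps.wikimedia.org/%swiki/latest/%swiki-latest-pages-articles.xml.bz2\n' "$1" "$1"
}

# Hypothetical helper: fetch, decompress, and extract at most
# COUNT sentences. Nothing is downloaded until it is called,
# e.g. `extract_corpus mt 100000`.
extract_corpus() {
    lang=$1
    count=$2
    dump="${lang}wiki-latest-pages-articles.xml"
    wget "$(dump_url "$lang")" &&
        bunzip2 "$dump.bz2" &&
        aq-wikicrp -c "$count" "$dump" "$lang.wikipedia.crp.txt"
}

# Only prints the URL; safe to run without network access.
dump_url mt
```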

aq-covtest — Coverage Testing

Help output

usage: aq-covtest [-h] [-X [STATFILE]] [-H] corpus dictionary

Test coverage.

positional arguments:
  corpus                Corpus text file
  dictionary            Binary dictionary (.bin, .fst, etc)

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -H, --hfst            HFST mode

Usage

aq-covtest mt.news.crp.txt mt-he.automorf.bin -X

aq-htmlgen — HTML Generation for Statistics

Help output

usage: aq-htmlgen [-h] [-t [TITLE]] statistics outdir

Generate webpage and related files.

positional arguments:
  statistics            Statistics file
  outdir                Output directory

optional arguments:
  -h, --help            show this help message and exit
  -t [TITLE], --title [TITLE]
                        Directory of dictionary (Default: current directory)

Usage

aq-htmlgen quality-stats.xml output
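
Since the test tools write to a statistics file via `-X` and aq-htmlgen reads one, a run can be chained as sketched below. The assumptions here: `run` and `collect_and_render` are hypothetical helpers, the file names are the ones from the usage examples on this page, and it is assumed (not confirmed by the help output) that successive tools add to the same statistics file rather than overwrite it.

```shell
# Hypothetical helper: with DRY_RUN=1, print the command instead
# of executing it, so the sequence can be checked before running.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: collect statistics, then render them.
# Stops at the first command that fails.
collect_and_render() {
    stats=$1
    outdir=$2
    run aq-ambtest apertium-mt-he.mt.dix -X "$stats" &&
        run aq-covtest mt.news.crp.txt mt-he.automorf.bin -X "$stats" &&
        run aq-htmlgen "$stats" "$outdir"
}

DRY_RUN=1
collect_and_render quality-stats.xml output
```

Setting `DRY_RUN=0` runs the three commands for real.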

aq-autotest — Automatic Testing using AQX files

Help output

usage: aq-autotest [-h] [-c] [-X [STATS]] [-o [OUTDIR]] aqx

Attempt all tests with default settings.

positional arguments:
  aqx                   Apertium Quality XML configuration file

optional arguments:
  -h, --help            show this help message and exit
  -c, --colour          Colours the output
  -X [STATS], --statistics [STATS]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -o [OUTDIR], --html [OUTDIR]
                        Output directory for HTML content

Usage

aq-autotest -X stats.xml -o output quality.aqx

aq-gentest — Generation Testing

Help output

usage: aq-gentest [-h] [-X [STATFILE]] [-d [DIRECTORY]] mode corpus

Test generation.

positional arguments:
  mode                  Language mode (eg, br-fr)
  corpus                Corpus text file

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -d [DIRECTORY], --dict [DIRECTORY]
                        Directory of dictionary (Default: current directory)

Usage

aq-gentest -d . mt-he mt.crp.txt -X

aq-dixtest — Dictionary tests (rule count, etc)

Help output

usage: aq-dixtest [-h] [-X [STATFILE]] [-d [DICTDIR]] langpair

Get general dictionary statistics.

positional arguments:
  langpair              Language pair (eg aa-ab)

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -d [DICTDIR], --dict [DICTDIR]
                        Directory of dictionary (Default: current directory)

Usage

aq-dixtest -d . mt-he 

aq-regtest — Regression Testing

Help output

usage: aq-regtest [-h] [-X [STATFILE]] [-d [DICTDIR]] mode wikiurl

Test for regressions directly from Apertium wiki.

positional arguments:
  mode                  Mode of operation (eg. br-fr)
  wikiurl               URL to regression tests

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -d [DICTDIR], --dict [DICTDIR]
                        Directory of dictionary (Default: current directory)

Usage

aq-regtest -d . mt-he http://wiki.apertium.org/wiki/Special:Export/Maltese_and_Hebrew/Regression_tests -X
aq-regtest -d . mt-he Regression_tests.xml -X
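
The second form takes a local file, which suggests the export can be downloaded once and reused. A minimal sketch of that workflow follows; `export_url` and `regtest_offline` are hypothetical helpers, and it is an assumption that the local file is simply a saved copy of the Special:Export page.

```shell
# Hypothetical helper: Special:Export URL for a wiki page title,
# following the URL used in the example above.
export_url() {
    printf 'http://wiki.apertium.org/wiki/Special:Export/%s\n' "$1"
}

# Hypothetical helper: save the export, then test against the
# local copy. Nothing is downloaded until it is called, e.g.
# `regtest_offline mt-he Maltese_and_Hebrew/Regression_tests`.
regtest_offline() {
    mode=$1
    page=$2
    wget -O Regression_tests.xml "$(export_url "$page")" &&
        aq-regtest -d . "$mode" Regression_tests.xml -X
}

# Only prints the URL; safe to run without network access.
export_url Maltese_and_Hebrew/Regression_tests
```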

aq-voctest — Vocabulary Testing

Help output

usage: aq-voctest [-h] [-X [STATFILE]] [-a [ANADIX]] [-g [GENBIN]]
                  [-d [DICTDIR]] [-D [DIRECTION]] [-o [OUTPUT]]
                  langpair

Test vocabulary for generation errors.

positional arguments:
  langpair              Language pair (eg, br-fr)

optional arguments:
  -h, --help            show this help message and exit
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -a [ANADIX], --anadix [ANADIX]
                        Analysis .dix file (Default: apertium-aa-ab.aa-ab.dix)
  -g [GENBIN], --genbin [GENBIN]
                        Generation .bin file (Default: apertium-aa-
                        ab.autogen.bin)
  -d [DICTDIR], --dict [DICTDIR]
                        Directory of dictionary (Default: current directory)
  -D [DIRECTION], --direction [DIRECTION]
                        Dictionary direction (lr, rl)
  -o [OUTPUT], --output [OUTPUT]
                        Output file for arrows output (Default: voctest.txt)

Usage

aq-voctest mt-he -X
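
To test both dictionary directions without one run overwriting the other's output, `-D` and `-o` can be combined. This is a sketch only: `voctest_cmds` is a hypothetical helper and the `voctest.lr.txt` / `voctest.rl.txt` output names are illustrative, not names the tool produces by itself.

```shell
# Hypothetical helper: print one aq-voctest command per direction
# (lr, rl), each writing to its own output file.
voctest_cmds() {
    pair=$1
    for dir in lr rl; do
        printf 'aq-voctest -D %s -o voctest.%s.txt %s -X\n' "$dir" "$dir" "$pair"
    done
}

voctest_cmds mt-he
```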

aq-morftest — Morph Testing (HFST, etc)

Help output

usage: aq-morftest [-h] [-c] [-X [STATFILE]] [-C] [-i] [-s] [-l] [-f] [-p]
                   [-S SECTION] [-t TEST] [-v] [--app APP] [--gen GEN]
                   [--morph MORPH]
                   test_file

Test morphological transducers for consistency. `hfst-lookup` (or Xerox'
`lookup` with argument -x) must be available on the PATH.

positional arguments:
  test_file             YAML file with test rules

optional arguments:
  -h, --help            show this help message and exit
  -c, --colour          Colours the output
  -X [STATFILE], --statistics [STATFILE]
                        XML file that statistics are to be stored in (Default: quality-stats.xml)
  -C, --compact         Makes output more compact
  -i, --ignore-extra-analyses
                        Ignore extra analyses when there are more than
                        expected, will PASS if the expected one is found.
  -s, --surface         Surface input/analysis tests only
  -l, --lexical         Lexical input/generation tests only
  -f, --hide-fails      Suppresses failures to make finding passes easier
  -p, --hide-passes     Suppresses passes to make finding failures easier
  -S SECTION, --section SECTION
                        The section to be used for testing (default is `hfst`)
  -t TEST, --test TEST  Which test to run (Default: all). TEST = test ID, e.g.
                        'Noun - gåetie' (remember quotes if the ID contains
                        spaces)
  -v, --verbose         More verbose output.
  --app APP             Override application used for test
  --gen GEN             Override generation transducer used for test
  --morph MORPH         Override morph transducer used for test

Will run all tests in the test_file by default.

Usage

aq-morftest tgl.yaml -X