Apertium Quality XML Configuration Format

The Apertium Quality configuration format is a simple XML format for declaring which files an automatic test, or any other tool that reads the data, needs in order to run.

The format is essentially as follows:

    <content attribs=stuff />

Currently supported test types:

  • coverage
  • tagging
  • regression
  • morph
  • generation

Currently supported content elements:

  • test
  • corpus

For a more detailed overview, see the schema.


Here is a template of the XML file:

<config xmlns="http://apertium.org/xml/quality/config/0.1">
</config>

Inside this document, add one element per file, choosing the element according to the type of file you want to declare for each test:

<config xmlns="http://apertium.org/xml/quality/config/0.1">
        <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
        <corpus generator="someotherscript.bash" language="mt-he" path="../belowhere.crp.txt" />
        <test language="mt-he" path="http://whatever/test.xml" />
        <test language="mt-he" path="localfile.xml" />
</config>
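
As an illustration of how a tool might read such a file, here is a minimal Python sketch that lists the declared corpus and test elements using only the standard library. It is not taken from Apertium Quality itself: the function name list_entries and the sample document are assumptions made for this example, and only the namespace and attribute names come from the configuration shown above.

```python
import xml.etree.ElementTree as ET

# Namespace as it appears in the configuration examples above.
NS = "http://apertium.org/xml/quality/config/0.1"

# Sample document; the file names and language pair are placeholders.
SAMPLE = """\
<config xmlns="http://apertium.org/xml/quality/config/0.1">
    <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
    <test language="mt-he" path="localfile.xml" />
</config>
"""


def list_entries(xml_text):
    """Return (kind, attributes) pairs for each corpus/test element.

    Hypothetical helper, not part of Apertium Quality.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for child in root:
        # ElementTree reports namespaced tags as "{namespace}local-name".
        kind = child.tag.replace("{%s}" % NS, "")
        if kind in ("corpus", "test"):
            entries.append((kind, dict(child.attrib)))
    return entries


if __name__ == "__main__":
    for kind, attrs in list_entries(SAMPLE):
        print(kind, attrs["language"], attrs["path"])
```

Elements other than corpus and test are skipped here, which is a choice made for the sketch rather than documented behaviour.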

Regression Test Format

The format for regression tests is simple: each test is a call to a MediaWiki template named test.


The parameters are pipe-delimited as follows:

{{test|<language>|<original text>|<expected result>}}

An optional fourth parameter holds a comment:

{{test|<language>|<original text>|<expected result>|<comment>}}


An example of usage:

* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}

The above shows up on the wiki as:

  • (fr) I am three years old. → J'ai trois ans. :: Checks correct verb use.
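
To make the parameter order concrete, the following Python sketch pulls test template calls out of a piece of wikitext. It is an illustration only, not the parser used by aq-regtest: the regular expression and the function name parse_tests are assumptions, and the sketch does not handle nested templates or pipe characters inside a parameter.

```python
import re

# Matches {{test|language|original|expected}} with an optional |comment part.
# Parameters may not contain "|", "{" or "}" in this simplified sketch.
TEST_RE = re.compile(
    r"\{\{test\|([^|{}]*)\|([^|{}]*)\|([^|{}]*)(?:\|([^|{}]*))?\}\}"
)


def parse_tests(wikitext):
    """Return one dict per {{test|...}} call found in wikitext.

    Hypothetical helper, not part of Apertium Quality.
    """
    tests = []
    for match in TEST_RE.finditer(wikitext):
        language, source, expected, comment = match.groups()
        tests.append({
            "language": language.strip(),
            "source": source.strip(),
            "expected": expected.strip(),
            # The comment group is None when the fourth parameter is absent.
            "comment": (comment or "").strip(),
        })
    return tests


if __name__ == "__main__":
    line = "* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}"
    for test in parse_tests(line):
        print(test)
```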

Usage as Regression Test

To use a wiki page as a regression test, insert Special:Export/ between wiki/ and the page name. For example, http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests becomes http://wiki.apertium.org/wiki/Special:Export/French_and_Breton/Regression_tests. Pass that URL as the parameter to aq-regtest.
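
The URL rewrite described above can be sketched in a few lines of Python. The function name to_export_url is made up for this example, and the sketch assumes the page URL contains a /wiki/ path segment as in the example address.

```python
def to_export_url(page_url):
    """Insert Special:Export/ after the first /wiki/ in a wiki page URL.

    Hypothetical helper, not part of Apertium Quality.
    """
    marker = "/wiki/"
    head, sep, page = page_url.partition(marker)
    if not sep:
        raise ValueError("not a /wiki/ page URL: %r" % page_url)
    if page.startswith("Special:Export/"):
        # Already an export URL; return it unchanged.
        return page_url
    return head + marker + "Special:Export/" + page


if __name__ == "__main__":
    print(to_export_url(
        "http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests"
    ))
```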

Morph Test Format

The morph test format is a YAML file (a human-readable data format that needs less punctuation than JSON) containing directives that describe how your morphological tests should be run.

Here's the basic layout:

    App: application_to_run
    Gen: generation_file.fst
    Morph: morf_file.fst
  Name of test to be run:
    input: expected output
    more input: more expected output
    some input: [one output, another possible output]

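As the layout shows, plain strings do not need quotes, which keeps these tests easy to write for programmers and non-programmers alike. Some strings must be quoted, however: a test name that begins with [, for instance, would otherwise be read as the start of a YAML list, which is why the test names in the example below are quoted.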
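
Because an expected output can be written either as a single string or as a list, a tool reading these tests has to treat both shapes uniformly. The Python sketch below shows one way to do that on data that has already been loaded by a YAML parser. The function names are made up for this example, and it assumes that a list means any one of the listed outputs is acceptable, which the layout above suggests but does not state outright.

```python
def expected_outputs(value):
    """Return the acceptable outputs for one test line as a list of strings.

    Hypothetical helper: accepts a single value or a list of values,
    matching the two shapes shown in the layout above.
    """
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    return [str(value)]


def passes(actual, value):
    """Assumption: a result passes if it equals any acceptable output."""
    return actual in expected_outputs(value)


# Data shaped like one test group after YAML parsing (placeholder strings).
group = {
    "input": "expected output",
    "some input": ["one output", "another possible output"],
}

if __name__ == "__main__":
    for source, value in group.items():
        print(source, "->", expected_outputs(value))
```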


This is a real-world example of what a test file can look like:

    App: hfst-optimised-lookup
    Gen: ../tr-ky.autogen.hfst
    Morph: ../ky-tr.automorf.hfst

  "[twol] Мягкий знак deletion before suffix":
    июль<n><dat> : июлга

  "[twol] L desonorisation":
    адам<n><pl><nom> : адамдар
    адам<n><pl><nom>+э<cop><p3><sg> : адамдар
    адам<n><pl><nom>+э<cop><p3><pl> : адамдар

  "[twol] Double vowel harmony in suffix":
    дүйнө<n><rloc> : дүйнөдөгү

  "[lexc] 1st person singular possessive":
    ат<n><px1sg><nom> : атым
    ат<n><px1sg><nom>+э<cop><p3><sg> : атым
    ат<n><px1sg><nom>+э<cop><p3><pl> : атым
    салт<n><px1sg><nom> : салтым
    салт<n><px1sg><nom>+э<cop><p3><sg> : салтым
    салт<n><px1sg><nom>+э<cop><p3><pl> : салтым