Difference between revisions of "Apertium-quality/Configuration"


Revision as of 06:20, 22 August 2011

Apertium Quality XML Configuration Format

The Apertium Quality Configuration format is a fairly simple XML format for declaring which files an automatic test, or any other tool that consumes the data, requires.

The format is essentially as follows:

<config>
  <test-type>
    <content attribs="stuff" />
  </test-type>
</config>

Currently supported test types:

  • coverage
  • tagging
  • regression
  • morph

Currently supported content elements:

  • test
  • corpus

For a more detailed overview, see the schema.

Example

Here is a template of the XML file:

<config xmlns="http://apertium.org/xml/quality/config/0.1">
</config>

This empty document is then filled with elements that depend on the type of file you want to declare for each test:

<config xmlns="http://apertium.org/xml/quality/config/0.1">
    <coverage>
        <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
        <corpus generator="someotherscript.bash" language="mt-he" path="../belowhere.crp.txt" />
    </coverage>
    <regression>
        <test language="mt-he" path="http://whatever/test.xml" />
        <test language="mt-he" path="localfile.xml" />
    </regression>
</config>
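
As a rough illustration of how a tool might read such a file, the sketch below walks a configuration with Python's standard xml.etree.ElementTree module and lists every declared file. The function name list_entries and the embedded sample are assumptions made for this example; this is not the loader that apertium-quality itself uses.

```python
import xml.etree.ElementTree as ET

# Sample configuration modelled on the example above (illustrative only).
SAMPLE = """<config xmlns="http://apertium.org/xml/quality/config/0.1">
    <coverage>
        <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
    </coverage>
    <regression>
        <test language="mt-he" path="http://whatever/test.xml" />
        <test language="mt-he" path="localfile.xml" />
    </regression>
</config>"""


def local_name(tag):
    """Strip the "{namespace}" prefix that ElementTree puts on tag names."""
    return tag.split("}", 1)[-1]


def list_entries(xml_text):
    """Return (test_type, content_element, attributes) for each declared file."""
    root = ET.fromstring(xml_text)
    entries = []
    for section in root:        # e.g. <coverage>, <regression>
        for item in section:    # e.g. <corpus>, <test>
            entries.append((local_name(section.tag),
                            local_name(item.tag),
                            dict(item.attrib)))
    return entries


entries = list_entries(SAMPLE)
for test_type, kind, attrs in entries:
    print(test_type, kind, attrs["language"], attrs["path"])
```

Because the document declares a default namespace, ElementTree reports tags in the form {namespace}name, which is why the helper strips that prefix before comparing names.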

Regression Test Format

Regression tests are written with a MediaWiki template called test.

Example

The parameters are pipe-delimited as follows:

{{test|<language>|<original text>|<expected result>}}

or

{{test|<language>|<original text>|<expected result>|<comment>}}


An example of usage:

* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}

The above shows up on the wiki as:

  • (fr) I am three years old. → J'ai trois ans. :: Checks correct verb use.
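
To make the parameter order concrete, here is a minimal sketch that extracts test templates from wikitext with a regular expression. The pattern and the name parse_tests are assumptions for illustration; the sketch does not handle nested templates or pipe characters inside the text, and it is not necessarily how aq-regtest parses pages.

```python
import re

# Matches {{test|language|original|expected}} with an optional |comment.
# Assumption: no parameter contains "|" or "}".
TEST_TEMPLATE = re.compile(
    r"\{\{test\|([^|}]*)\|([^|}]*)\|([^|}]*)(?:\|([^|}]*))?\}\}"
)


def parse_tests(wikitext):
    """Return one dict per test template found in the wikitext."""
    tests = []
    for match in TEST_TEMPLATE.finditer(wikitext):
        language, original, expected, comment = match.groups()
        tests.append({
            "language": language.strip(),
            "original": original.strip(),
            "expected": expected.strip(),
            "comment": (comment or "").strip(),  # empty when omitted
        })
    return tests


page = "* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}"
tests = parse_tests(page)
print(tests[0]["original"], "->", tests[0]["expected"])
```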


Usage as Regression Test

To use your wiki page as a regression test, insert Special:Export between wiki/ and your page name. For example, http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests becomes http://wiki.apertium.org/wiki/Special:Export/French_and_Breton/Regression_tests. Pass that link as the parameter to aq-regtest.
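
The URL rewrite described above can be sketched as a small helper. The function name export_url is hypothetical, and the sketch assumes the page URL contains the /wiki/ path segment used on wiki.apertium.org.

```python
def export_url(page_url):
    """Insert Special:Export/ after the /wiki/ segment of a wiki page URL."""
    marker = "/wiki/"
    head, sep, page = page_url.partition(marker)
    if not sep:
        raise ValueError("URL does not contain %r: %s" % (marker, page_url))
    if page.startswith("Special:Export/"):
        return page_url  # already an export link, leave it unchanged
    return head + marker + "Special:Export/" + page


url = export_url(
    "http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests"
)
print(url)
```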

Morph Test Format

The morph testing format is a YAML file (YAML is a data format similar to JSON, but easier to write by hand) containing directives that describe how your morphological tests should be run.

Here's the basic layout:

Config:
  CONFIG_OPT:
    App: application_to_run
    Gen: generation_file.fst
    Morph: morf_file.fst
Tests:
  Name of test to be run:
    input: expected output
    more input: more expected output
    some input: [one output, another possible output]

As you can see, input strings generally don't need quotes, which keeps these tests easy to write for programmers and non-programmers alike. Some strings must be quoted, however: a test name that begins with [ would otherwise be read by YAML as the start of a list, as you will see below.
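
The layout above maps onto nested dictionaries once it has been parsed. The sketch below assumes the file has already been loaded into a Python dict (for instance with a YAML parser such as PyYAML's yaml.safe_load) and shows one way to treat a single expected output and a list of possible outputs uniformly. The name iter_morph_tests and the sample data are hypothetical, not part of apertium-quality.

```python
def iter_morph_tests(data):
    """Yield (test_name, input_string, list_of_expected_outputs)."""
    for name, cases in data.get("Tests", {}).items():
        for source, expected in cases.items():
            # A bare string is one expected output; a list holds alternatives.
            outputs = expected if isinstance(expected, list) else [expected]
            yield name, source, outputs


# What a YAML parser would return for the basic layout shown above.
data = {
    "Config": {
        "CONFIG_OPT": {
            "App": "application_to_run",
            "Gen": "generation_file.fst",
            "Morph": "morf_file.fst",
        }
    },
    "Tests": {
        "Name of test to be run": {
            "input": "expected output",
            "more input": "more expected output",
            "some input": ["one output", "another possible output"],
        }
    },
}

cases = list(iter_morph_tests(data))
for name, source, outputs in cases:
    print(name, "|", source, "->", outputs)
```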

Example

This is a real-world example of how a test file can look:

Config:
  hfst:
    App: hfst-optimised-lookup
    Gen: ../tr-ky.autogen.hfst
    Morph: ../ky-tr.automorf.hfst

Tests:
  "[twol] Мягкий знак deletion before suffix":
    июль<n><dat> : июлга

  "[twol] L desonorisation":
    адам<n><pl><nom> : адамдар
    адам<n><pl><nom>+э<cop><p3><sg> : адамдар
    адам<n><pl><nom>+э<cop><p3><pl> : адамдар

  "[twol] Double vowel harmony in suffix":
    дүйнө<n><rloc> : дүйнөдөгү

  "[lexc] 1st person singular possessive":
    ат<n><px1sg><nom> : атым
    ат<n><px1sg><nom>+э<cop><p3><sg> : атым
    ат<n><px1sg><nom>+э<cop><p3><pl> : атым
    салт<n><px1sg><nom> : салтым
    салт<n><px1sg><nom>+э<cop><p3><sg> : салтым
    салт<n><px1sg><nom>+э<cop><p3><pl> : салтым