Difference between revisions of "Apertium-quality/Configuration"

Latest revision as of 06:29, 22 August 2011

Apertium Quality XML Configuration Format

The Apertium Quality configuration format is a fairly simple XML format for declaring which files an automatic test, or any other tool that uses the data, requires.

The format is essentially as follows:

<config>
  <test-type>
    <content attribs="stuff" />
  </test-type>
</config>

Currently supported test types:

  • coverage
  • tagging
  • regression
  • morph
  • generation

Currently supported content elements:

  • test
  • corpus

For a more detailed overview, see the schema.

Example

Here is a template of the XML file:

<config xmlns="http://apertium.org/xml/quality/config/0.1">
</config>

This root element takes child elements that depend on the type of file you want to declare for each test.

<config xmlns="http://apertium.org/xml/quality/config/0.1">
    <coverage>
        <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
        <corpus generator="someotherscript.bash" language="mt-he" path="../belowhere.crp.txt" />
    </coverage>
    <regression>
        <test language="mt-he" path="http://whatever/test.xml" />
        <test language="mt-he" path="localfile.xml" />
    </regression>
</config>
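
To see how a tool might consume such a file, here is a minimal Python sketch that lists every declared file together with its test type. It is not the Apertium Quality implementation: the function name list_entries and the shortened sample document are assumptions made for illustration, and only the standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the template above.
NS = "http://apertium.org/xml/quality/config/0.1"

# Shortened copy of the example configuration (illustrative only).
SAMPLE = """<config xmlns="http://apertium.org/xml/quality/config/0.1">
    <coverage>
        <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
    </coverage>
    <regression>
        <test language="mt-he" path="http://whatever/test.xml" />
        <test language="mt-he" path="localfile.xml" />
    </regression>
</config>"""


def list_entries(xml_text):
    """Return (test_type, element_name, attributes) for every declared file.

    Hypothetical helper, not part of any Apertium tool.
    """
    root = ET.fromstring(xml_text)
    prefix = "{" + NS + "}"
    entries = []
    for test_type in root:         # e.g. <coverage>, <regression>
        for content in test_type:  # e.g. <corpus>, <test>
            entries.append((
                test_type.tag.replace(prefix, ""),
                content.tag.replace(prefix, ""),
                dict(content.attrib),
            ))
    return entries


entries = list_entries(SAMPLE)
for entry in entries:
    print(entry)
```

Because the document declares a default namespace, ElementTree reports tags as {namespace}name, which is why the sketch strips the prefix before returning the names.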

Regression Test Format

Regression tests are written with a single MediaWiki template called test.

Example

The parameters are pipe-delimited as follows:

{{test|<language>|<original text>|<expected result>}}

or

{{test|<language>|<original text>|<expected result>|<comment>}}


An example of usage:

* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}

The above shows up on the wiki as:

  • (fr) I am three years old. → J'ai trois ans. :: Checks correct verb use.
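
As a rough illustration of the template's structure, the following Python sketch extracts the parameters from wiki text with a regular expression. It is not how aq-regtest parses pages; the names TEST_RE and parse_tests are hypothetical, and the pattern assumes that no parameter contains a pipe or a closing brace.

```python
import re

# Matches {{test|language|original text|expected result}} with an
# optional |comment. Assumption: parameters contain no "|" or "}".
TEST_RE = re.compile(
    r"\{\{test\|([^|}]*)\|([^|}]*)\|([^|}]*)(?:\|([^|}]*))?\}\}"
)


def parse_tests(wikitext):
    """Return one dict per {{test|...}} template found in the text."""
    tests = []
    for match in TEST_RE.finditer(wikitext):
        language, source, expected, comment = match.groups()
        tests.append({
            "language": language,
            "source": source,
            "expected": expected,
            "comment": comment,  # None when the comment is omitted
        })
    return tests


line = "* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}"
tests = parse_tests(line)
print(tests[0])
```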


Usage as Regression Test

To use your wiki page as a regression test, insert Special:Export/ between wiki/ and your page name. For example, http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests becomes http://wiki.apertium.org/wiki/Special:Export/French_and_Breton/Regression_tests. Pass that link as the parameter to aq-regtest.
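
The URL rewrite described above can be sketched in Python as follows. The function name export_url is hypothetical, and the sketch assumes the page URL contains a /wiki/ path segment, as the Apertium wiki URLs in this section do.

```python
def export_url(page_url):
    """Insert Special:Export/ between wiki/ and the page name.

    Hypothetical helper; assumes the URL contains "/wiki/".
    """
    marker = "/wiki/"
    head, sep, page = page_url.partition(marker)
    if not sep:
        raise ValueError("expected a URL containing '/wiki/'")
    return head + marker + "Special:Export/" + page


url = export_url("http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests")
print(url)
```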

Morph Test Format

The morph testing format is a YAML file (YAML is a plain-text data format that is generally easier to write by hand than JSON) with several directives describing how your morphological tests should be run.

Here's the basic layout:

Config:
  CONFIG_OPT:
    App: application_to_run
    Gen: generation_file.fst
    Morph: morf_file.fst
Tests:
  Name of test to be run:
    input: expected output
    more input: more expected output
    some input: [one output, another possible output]

As you can see, inputs and outputs don't need quotes to be treated as strings, which keeps these tests easy to write for programmers and non-programmers alike. Some strings must be quoted, however, such as the test names beginning with [ in the example below.
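
The layout allows an expected output to be either a single string or a list. The Python sketch below shows one way a checker could handle both, assuming that a list means any one of the listed outputs is acceptable. It works on an already-loaded dictionary, so no YAML library is needed; the names expected_outputs and check, and the stand-in for the real analyser, are all hypothetical.

```python
def expected_outputs(value):
    """Normalise a test's right-hand side to a list of acceptable strings."""
    if isinstance(value, list):
        return [str(item) for item in value]
    return [str(value)]


def check(tests, run):
    """Return (test name, input, actual output) for each failing case.

    `run` is any callable mapping an input string to an output string;
    a real setup would call the application named under Config instead.
    """
    failures = []
    for name, cases in tests.items():
        for text, expected in cases.items():
            actual = run(text)
            if actual not in expected_outputs(expected):
                failures.append((name, text, actual))
    return failures


# The Tests section of the basic layout, as a loaded dictionary.
tests = {
    "Name of test to be run": {
        "input": "expected output",
        "some input": ["one output", "another possible output"],
    }
}

# Stand-in for the real application (assumption for illustration).
fake_outputs = {
    "input": "expected output",
    "some input": "another possible output",
}
failures = check(tests, fake_outputs.get)
print(failures)
```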

Example

This is a real-world example of how a test file can look:

Config:
  hfst:
    App: hfst-optimised-lookup
    Gen: ../tr-ky.autogen.hfst
    Morph: ../ky-tr.automorf.hfst

Tests:
  "[twol] Мягкий знак deletion before suffix":
    июль<n><dat> : июлга

  "[twol] L desonorisation":
    адам<n><pl><nom> : адамдар
    адам<n><pl><nom>+э<cop><p3><sg> : адамдар
    адам<n><pl><nom>+э<cop><p3><pl> : адамдар

  "[twol] Double vowel harmony in suffix":
    дүйнө<n><rloc> : дүйнөдөгү

  "[lexc] 1st person singular possessive":
    ат<n><px1sg><nom> : атым
    ат<n><px1sg><nom>+э<cop><p3><sg> : атым
    ат<n><px1sg><nom>+э<cop><p3><pl> : атым
    салт<n><px1sg><nom> : салтым
    салт<n><px1sg><nom>+э<cop><p3><sg> : салтым
    салт<n><px1sg><nom>+э<cop><p3><pl> : салтым