Apertium-quality/Configuration
Latest revision as of 06:29, 22 August 2011
Apertium Quality XML Configuration Format
The Apertium Quality configuration format is a fairly simple XML format for declaring which files an automatic test needs in order to run. Other tools may read the same data for their own purposes.
The format is essentially as follows:
<pre>
<config>
  <test-type>
    <content attribs=stuff />
  </test-type>
</config>
</pre>
Currently supported test types:
- coverage
- tagging
- regression
- morph
- generation
Currently supported content elements:
- test
- corpus
For a more detailed overview, see the schema.
Example
Here is a template of the XML file:
<pre>
<config xmlns="http://apertium.org/xml/quality/config/0.1">
</config>
</pre>
This minimal document is then filled with elements that depend on the type of file you want to declare for each test:
<pre>
<config xmlns="http://apertium.org/xml/quality/config/0.1">
  <coverage>
    <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
    <corpus generator="someotherscript.bash" language="mt-he" path="../belowhere.crp.txt" />
  </coverage>
  <regression>
    <test language="mt-he" path="http://whatever/test.xml" />
    <test language="mt-he" path="localfile.xml" />
  </regression>
</config>
</pre>
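To make the structure concrete, here is a minimal sketch of how a script might read such a configuration with Python's standard library. It is not taken from the apertium-quality code: the function name `list_entries` and the embedded sample are illustrative assumptions, and the sketch only relies on the layout described above (test-type elements containing content elements with attributes).

```python
import xml.etree.ElementTree as ET

# Sample configuration, modelled on the example above.
EXAMPLE = """\
<config xmlns="http://apertium.org/xml/quality/config/0.1">
  <coverage>
    <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
  </coverage>
  <regression>
    <test language="mt-he" path="http://whatever/test.xml" />
    <test language="mt-he" path="localfile.xml" />
  </regression>
</config>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.split("}", 1)[-1]


def list_entries(xml_text):
    """Hypothetical helper: return (test type, content element, attributes)
    for every file declared in the configuration."""
    root = ET.fromstring(xml_text)
    entries = []
    for test_type in root:
        for content in test_type:
            entries.append(
                (local_name(test_type.tag), local_name(content.tag), dict(content.attrib))
            )
    return entries


for entry in list_entries(EXAMPLE):
    print(entry)
```

Each tuple pairs a test type (such as `regression`) with one declared file, which is roughly the information a test runner would need to locate its inputs.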
Regression Test Format
Regression tests are extremely simple to write: each test is a call to a MediaWiki template named test.
Example
The parameters are pipe-delimited as follows:
<pre>{{test|<language>|<original text>|<expected result>}}</pre>
or, with an optional comment:
<pre>{{test|<language>|<original text>|<expected result>|<comment>}}</pre>
An example of usage:
* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}
The above shows up on the wiki as:
- (fr) I am three years old. → J'ai trois ans. :: Checks correct verb use.
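As an illustration of the template syntax, the following sketch extracts the fields of each test call from a piece of wikitext. It is not how aq-regtest itself reads tests; the name `parse_tests` and the regular expression are assumptions made for this example, and the pattern does not handle nested templates or pipes inside the text.

```python
import re

# {{test|language|original|expected}} with an optional fifth comment field.
TEST_RE = re.compile(
    r"\{\{test\|([^|{}]*)\|([^|{}]*)\|([^|{}]*)(?:\|([^|{}]*))?\}\}"
)


def parse_tests(wikitext):
    """Hypothetical helper: return one dict per {{test|...}} call found."""
    tests = []
    for match in TEST_RE.finditer(wikitext):
        language, original, expected, comment = match.groups()
        tests.append(
            {
                "language": language.strip(),
                "original": original.strip(),
                "expected": expected.strip(),
                # The comment is optional, so fall back to an empty string.
                "comment": (comment or "").strip(),
            }
        )
    return tests


sample = "* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}"
print(parse_tests(sample))
```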
Usage as Regression Test
To use your wiki page as a regression test, insert Special:Export/ between wiki/ and your page name. For example, http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests becomes http://wiki.apertium.org/wiki/Special:Export/French_and_Breton/Regression_tests. Pass that link as the parameter to aq-regtest and you're good to go.
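The URL rewrite described here can be sketched as a small helper. The function name `to_export_url` is an assumption for illustration, and the sketch presumes the wiki serves pages under a /wiki/ path, as in the example URL.

```python
def to_export_url(page_url):
    """Hypothetical helper: insert Special:Export/ after the /wiki/ path segment."""
    marker = "/wiki/"
    prefix, found, page = page_url.partition(marker)
    if not found:
        raise ValueError("expected a URL containing " + marker)
    if page.startswith("Special:Export/"):
        # Already an export link; leave it untouched.
        return page_url
    return prefix + marker + "Special:Export/" + page


print(to_export_url("http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests"))
```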
Morph Test Format
The morph testing format is simply a YAML file (a data format related to JSON, but easier to write by hand) with directives describing how your morphological tests should be conducted.
Here's the basic layout:
<pre>
Config:
  CONFIG_OPT:
    App: application_to_run
    Gen: generation_file.fst
    Morph: morf_file.fst
Tests:
  Name of test to be run:
    input: expected output
    more input: more expected output
    some input: [one output, another possible output]
</pre>
As you can see, inputs and outputs do not need quotes to be treated as strings, which makes these tests a breeze to write for programmers and non-programmers alike. Some strings must be quoted, however: a test name that begins with [, for instance, would otherwise be read as the start of a YAML list, as you will see below.
Example
This is a real-world example of how a test file can look:
<pre>
Config:
  hfst:
    App: hfst-optimised-lookup
    Gen: ../tr-ky.autogen.hfst
    Morph: ../ky-tr.automorf.hfst
Tests:
  "[twol] Мягкий знак deletion before suffix":
    июль<n><dat> : июлга
  "[twol] L desonorisation":
    адам<n><pl><nom> : адамдар
    адам<n><pl><nom>+э<cop><p3><sg> : адамдар
    адам<n><pl><nom>+э<cop><p3><pl> : адамдар
  "[twol] Double vowel harmony in suffix":
    дүйнө<n><rloc> : дүйнөдөгү
  "[lexc] 1st person singular possessive":
    ат<n><px1sg><nom> : атым
    ат<n><px1sg><nom>+э<cop><p3><sg> : атым
    ат<n><px1sg><nom>+э<cop><p3><pl> : атым
    салт<n><px1sg><nom> : салтым
    салт<n><px1sg><nom>+э<cop><p3><sg> : салтым
    салт<n><px1sg><nom>+э<cop><p3><pl> : салтым
</pre>
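Because an expected output may be either a single string or a list of acceptable strings, a test runner has to treat both forms uniformly. The sketch below shows one way to do that on the Tests section once it has been loaded into a Python dict (for instance by a YAML loader). The names `normalise_tests` and `failing_cases`, and the `run` callable standing in for the real generator or analyser, are assumptions for illustration and not part of apertium-quality.

```python
def normalise_tests(tests):
    """Turn every expected value into a list of acceptable outputs."""
    normalised = {}
    for name, cases in tests.items():
        normalised[name] = {
            source: list(expected) if isinstance(expected, list) else [expected]
            for source, expected in cases.items()
        }
    return normalised


def failing_cases(tests, run):
    """Return (test name, input, actual output, acceptable outputs) for each
    case where run(input) is not among the acceptable outputs.
    `run` is a hypothetical callable wrapping the application under test."""
    failures = []
    for name, cases in normalise_tests(tests).items():
        for source, acceptable in cases.items():
            actual = run(source)
            if actual not in acceptable:
                failures.append((name, source, actual, acceptable))
    return failures


# The Tests section of the basic layout, as a loader would return it.
tests = {
    "Name of test to be run": {
        "input": "expected output",
        "some input": ["one output", "another possible output"],
    }
}

# A stand-in for the real tool: a fixed lookup table.
fake_outputs = {"input": "expected output", "some input": "unexpected output"}
print(failing_cases(tests, fake_outputs.get))
```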