Difference between revisions of "Apertium-quality/Configuration"
Latest revision as of 06:29, 22 August 2011
Apertium Quality XML Configuration Format
The Apertium Quality configuration format is a fairly simple XML format for declaring which files are required to run an automatic test, or by any other tool that consumes the same data.
The format is essentially as follows:
<config>
  <test-type>
    <content attribs=stuff />
  </test-type>
</config>
Currently supported test types:
- coverage
- tagging
- regression
- morph
- generation
Currently supported content elements:
- test
- corpus
For a more detailed overview, see the schema.
Example
Here is a template of the XML file:
<config xmlns="http://apertium.org/xml/quality/config/0.1"> </config>
This simple document takes several elements, depending on the type of file you want to declare for each test.
<config xmlns="http://apertium.org/xml/quality/config/0.1">
  <coverage>
    <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
    <corpus generator="someotherscript.bash" language="mt-he" path="../belowhere.crp.txt" />
  </coverage>
  <regression>
    <test language="mt-he" path="http://whatever/test.xml" />
    <test language="mt-he" path="localfile.xml" />
  </regression>
</config>
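As a rough illustration of how a tool might consume such a file, the sketch below reads a configuration with Python's standard xml.etree.ElementTree module and lists every declared entry by test type. The function name list_entries is hypothetical, and this is not necessarily how Apertium Quality itself loads the file; only the namespace, element names, and attributes are taken from the example above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the template above.
NS = "{http://apertium.org/xml/quality/config/0.1}"

CONFIG = """\
<config xmlns="http://apertium.org/xml/quality/config/0.1">
  <coverage>
    <corpus generator="gencrp.py" language="mt-he" path="relative-level-crp.txt" />
  </coverage>
  <regression>
    <test language="mt-he" path="localfile.xml" />
  </regression>
</config>
"""

def list_entries(xml_text):
    """Return (test_type, content_element, attributes) for every declared entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for test_type in root:          # e.g. <coverage>, <regression>
        for content in test_type:   # e.g. <corpus>, <test>
            entries.append((
                test_type.tag.replace(NS, ""),
                content.tag.replace(NS, ""),
                dict(content.attrib),
            ))
    return entries

for entry in list_entries(CONFIG):
    print(entry)
```

Each tuple pairs a test type with one content element and its attributes, so a caller could dispatch on the first field to decide which kind of test to run.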
Regression Test Format
The format for creating regression tests is extremely simple: it is a MediaWiki template called test.
Example
The parameters are pipe-delimited as follows:
{{test|<language>|<original text>|<expected result>}}
or
{{test|<language>|<original text>|<expected result>|<comment>}}
An example of usage:
* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}
The above shows up on the wiki as:
- (fr) I am three years old. → J'ai trois ans. :: Checks correct verb use.
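To make the parameter order concrete, here is a minimal sketch of pulling test calls out of wiki text with Python's standard re module. The names TEST_CALL and parse_tests are hypothetical, and the sketch assumes that no parameter itself contains a pipe or a nested template; it is not the parser that aq-regtest uses.

```python
import re

# Matches one {{test|...}} call; the captured group is everything after "test|".
TEST_CALL = re.compile(r"\{\{test\|(.*?)\}\}")

def parse_tests(wikitext):
    """Extract (language, original, expected, comment) tuples from wiki text."""
    tests = []
    for body in TEST_CALL.findall(wikitext):
        parts = body.split("|")
        if len(parts) not in (3, 4):
            continue  # skip calls that do not have three or four parameters
        language, original, expected = parts[:3]
        comment = parts[3] if len(parts) == 4 else None
        tests.append((language, original, expected, comment))
    return tests

line = "* {{test|fr|I am three years old.|J'ai trois ans.|Checks correct verb use}}"
print(parse_tests(line))
```

When the optional comment is omitted, the fourth field of the tuple is None.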
Usage as Regression Test
To use your wiki page as a regression test, insert Special:Export/ between wiki/ and your page name. For example, http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests becomes http://wiki.apertium.org/wiki/Special:Export/French_and_Breton/Regression_tests. Pass that link as the parameter to aq-regtest and you're good to go.
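The URL rewrite described above can be sketched in a few lines of Python. The helper name export_url is hypothetical, and the sketch assumes that the page path follows the first /wiki/ in the URL, as it does on wiki.apertium.org.

```python
def export_url(page_url, marker="/wiki/"):
    """Insert Special:Export/ directly after the first '/wiki/' in the URL."""
    head, sep, page = page_url.partition(marker)
    if not sep:
        raise ValueError("URL does not contain %r" % marker)
    return head + sep + "Special:Export/" + page

print(export_url("http://wiki.apertium.org/wiki/French_and_Breton/Regression_tests"))
```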
Morph Test Format
The morph testing format is simply a YAML file (a data format similar to JSON, but easier to write by hand) with directives describing how your morphological tests should be conducted.
Here's the basic layout:
Config:
  CONFIG_OPT:
    App: application_to_run
    Gen: generation_file.fst
    Morph: morf_file.fst
Tests:
  Name of test to be run:
    input: expected output
    more input: more expected output
    some input: [one output, another possible output]
As you can see, input strings do not need quotes to be treated as strings, which makes writing these tests a breeze for programmers and non-programmers alike. Some strings must be quoted, however, as you will see below.
Example
This is a real-world example of how a test file can look:
Config:
  hfst:
    App: hfst-optimised-lookup
    Gen: ../tr-ky.autogen.hfst
    Morph: ../ky-tr.automorf.hfst

Tests:
  "[twol] Мягкий знак deletion before suffix":
    июль<n><dat> : июлга

  "[twol] L desonorisation":
    адам<n><pl><nom> : адамдар
    адам<n><pl><nom>+э<cop><p3><sg> : адамдар
    адам<n><pl><nom>+э<cop><p3><pl> : адамдар

  "[twol] Double vowel harmony in suffix":
    дүйнө<n><rloc> : дүйнөдөгү

  "[lexc] 1st person singular possessive":
    ат<n><px1sg><nom> : атым
    ат<n><px1sg><nom>+э<cop><p3><sg> : атым
    ат<n><px1sg><nom>+э<cop><p3><pl> : атым
    салт<n><px1sg><nom> : салтым
    салт<n><px1sg><nom>+э<cop><p3><sg> : салтым
    салт<n><px1sg><nom>+э<cop><p3><pl> : салтым
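Once a YAML parser has loaded such a file, the Tests section becomes nested mappings in which an expected output is either a single string or a list of alternatives. The sketch below shows one way a runner might handle both cases. It is written under assumptions: the TESTS dictionary is a hand-written stand-in for parsed YAML, and expected_forms, check, and fake_lookup are hypothetical names rather than part of Apertium Quality.

```python
# Hand-written stand-in for the parsed "Tests" section: a single expected
# form is a string, several accepted forms are a list.
TESTS = {
    "[twol] L desonorisation": {
        "адам<n><pl><nom>": "адамдар",
    },
    "Hypothetical test with alternatives": {
        "some input": ["one output", "another possible output"],
    },
}

def expected_forms(value):
    """Normalise an expected-output entry to a list of strings."""
    return list(value) if isinstance(value, list) else [value]

def check(tests, lookup):
    """Yield (test name, input, passed) using a caller-supplied lookup function."""
    for name, cases in tests.items():
        for source, expected in cases.items():
            yield name, source, lookup(source) in expected_forms(expected)

# Hypothetical lookup standing in for a call to the configured application.
fake_lookup = {
    "адам<n><pl><nom>": "адамдар",
    "some input": "a third output",
}.get

results = list(check(TESTS, fake_lookup))
for name, source, passed in results:
    print("PASS" if passed else "FAIL", name, source)
```

Normalising every expected value to a list keeps the comparison identical for tests with one accepted form and tests with several.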