Difference between revisions of "Meta-evaluation"
Firespeaker (talk | contribs) |
Revision as of 22:15, 30 May 2019
Apertium language modules and translation pairs are subject to the following types of evaluation:
- Morphology coverage / regression testing
- Size of system
  - Number of stems in lexc, monodix, bidix
  - Number of disambiguation rules
  - Number of lexical selection rules
  - Number of transfer rules
- Naïve coverage
  - Monolingual naïve coverage
  - Trimmed naïve coverage (i.e., using a trimmed dictionary)
- Accuracy of analyses
  - Precision/Recall/F-score
- Accuracy of translation
  - WER/PER/BLEU
- Cleanliness of translation output
  - Testvoc
Morphology coverage
The tools we have for this are aq-morftest from Apertium quality and morph-test.py.
Naïve coverage
In theory, aq-covtest does this, but mostly people write their own scripts.
A generalised script that supports both hfst and lttoolbox binaries and arbitrary corpora would be useful. It should also (optionally) output hitparades (e.g., frequency lists of unknown forms in the corpus).
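As a rough illustration of what such a script might compute, here is a minimal sketch in Python. It assumes the corpus has already been run through an analyser and is available in Apertium stream format (`^surface/analysis1/analysis2$`, with unknown forms marked as `^surface/*surface$`). The function name `naive_coverage` and the sample text are hypothetical, and the regular expression ignores escaped characters, so treat this as a starting point rather than a finished tool.

```python
import re
from collections import Counter

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Simplification (assumption): escaped characters such as \/ or \$ are not handled.
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")


def naive_coverage(analysed_text):
    """Return (coverage, hitparade) for analyser output in stream format.

    coverage  -- fraction of tokens that received at least one analysis
    hitparade -- list of (unknown form, frequency), most frequent first
    """
    known = 0
    unknown = Counter()
    for surface, analyses in LEXICAL_UNIT.findall(analysed_text):
        if analyses.startswith("*"):
            # The analyser marks unknown forms with a leading asterisk.
            unknown[surface] += 1
        else:
            known += 1
    total = known + sum(unknown.values())
    coverage = known / total if total else 0.0
    return coverage, unknown.most_common()


# Hypothetical analyser output, for illustration only.
sample = (
    "^the/the<det><def><sp>$ ^blorf/*blorf$ "
    "^cats/cat<n><pl>$ ^blorf/*blorf$"
)
coverage, hitparade = naive_coverage(sample)
print(f"naive coverage: {coverage:.2%}")
for form, count in hitparade:
    print(count, form)
```

Because the function only reads stream-format text, the same code would work regardless of whether the analysis came from an lttoolbox or an hfst binary.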
Translation accuracy
apertium-eval-translator.pl and apertium-eval-translator-line.pl work well but are a bit old, and could probably benefit from being rewritten in Python.
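For orientation, the two simplest metrics listed above (WER and PER) can be sketched in a few lines of Python. This is not a port of the Perl scripts: tokenisation is plain whitespace splitting, PER follows one common formulation (bag-of-words matches, normalised by reference length), and the function names are hypothetical. The existing scripts may differ in their details.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate (one common formulation)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Words matched regardless of position (multiset intersection).
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


reference = "the cat sat on the mat"
hypothesis = "the cat on the mat sat"
print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"PER: {per(reference, hypothesis):.3f}")
```

The example pair contains the same words in a different order, so PER is zero while WER is not, which is the distinction between the two metrics.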
Translation cleanliness
There are several ways to test translation cleanliness. From simplest to most involved:
- morphology expansion testvoc
- corpus testvoc
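Both kinds of testvoc come down to scanning translation output for error marks. The sketch below, in Python, assumes the usual Apertium debug symbols at the start of a token: `*` for an unknown word, `@` for a lemma missing from the bilingual dictionary, and `#` for a generation failure. The function name `find_debris` and the sample lines are hypothetical, and multiword units are only caught by their first word.

```python
import re
from collections import Counter

# A token that starts with one of the error marks #, @ or *.
# Assumption: marks appear token-initially, preceded by whitespace or line start.
ERROR_MARK = re.compile(r"(?<!\S)([#@*])(\S+)")


def find_debris(translated_lines):
    """Count (mark, form) pairs found in an iterable of translated lines."""
    counts = Counter()
    for line in translated_lines:
        for mark, form in ERROR_MARK.findall(line):
            counts[(mark, form)] += 1
    return counts


# Hypothetical translator output, for illustration only.
sample_output = [
    "The #gato sat on the @alfombra",
    "A clean line",
    "The *blorf and the #gato",
]
debris = find_debris(sample_output)
for (mark, form), count in debris.most_common():
    print(count, mark + form)
```

An empty result would indicate clean output for the input that was tested; for a morphology expansion testvoc the input would be the expanded dictionary, and for a corpus testvoc it would be running text.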