Difference between revisions of "Translation quality statistics"

From Apertium
(unknown yes)
(21 intermediate revisions by 6 users not shown)
Line 1: Line 1:
This page aims to give an overview of the ''quality'' of various translators available in the Apertium platform. Word Error Rate (WER) and Position-independent Word Error Rate (PWER) are measures of post-edition effort. The number gives the expected number of words needed to be corrected in 100 words of running text. So, a WER of 4.7% indicates that in a given 100 words of text, 4.7 of them will need to be corrected by the post-editor.
This page aims to give an overview of the ''quality'' of various translators available in the Apertium platform. Word Error Rate (WER) and Position-independent Word Error Rate (PWER) are measures of post-edition effort. The number gives the expected number of words needed to be corrected in 100 words of running text. So, a WER of 4.7% indicates that in a given 100 words of text, 4.7 of them will need to be corrected by the post-editor.

Precise numbers may vary due to differences in how the sentences to be evaluated are selected. In some pairs unknown words are taken into account, in others they are not. Evaluations where unknown words are allowed will likely give more accurate numbers for post-edition effort, provided the corpus on which the evaluation was made resembles the corpus on which further translations will be made. Evaluations not allowing unknown words will give a better indication of the "best-case" working of the transfer rules.


{|class="wikitable"
{|class="wikitable"
! Translator !! Date !! Version !! Direction !! WER !! PWER !! BLEU !! Reference
! Translator !! Date !! Version !! Direction !! Unknown<br/>words !! WER !! PWER !! BLEU !! Reference / Notes
|-
|-
|rowspan=2| <code>apertium-nn-nb</code> ||rowspan=2|12th October 2009||rowspan=2| 0.6.1 || nn → nb || - || - || - ||rowspan=2|
|rowspan=2| <code>[[apertium-eo-fr]]</code> ||rowspan=2|11th&nbsp;February&nbsp;2011 ||rowspan=2| || fr → eo ||rowspan=2 {{yes}} || 22.4 % || 20.6 % || - ||rowspan=2| [[French_and_Esperanto/Quality_tests]]
|-
|-
| nb → nn || - || - || -
| eo → fr || - || - || -
|-
|-
|rowspan=2| <code>apertium-sv-da</code> ||rowspan=2|12th Oct 2009 ||rowspan=2| 0.5.0 || sv → da || 30.3 % || 27.7 % || - ||rowspan=2| http://wiki.apertium.org/w/index.php?title=Swedish_and_Danish/Evaluation&oldid=14881
|rowspan=2| <code>[[apertium-mk-en]]</code> ||rowspan=2|19th&nbsp;September&nbsp;2010||rowspan=2| 0.1.0 || mk → en ||rowspan=2 {{no}} || 43.96% || 31.22% || - ||rowspan=2| Percentage is average of 1,000 words from SETimes and 1,000 from Wikipedia
|-
| en → mk || - || - ||
|-
|rowspan=2| <code>[[apertium-mk-bg]]</code> ||rowspan=2|31st&nbsp;August&nbsp;2010||rowspan=2| 0.1.0 || mk → bg ||rowspan=2 {{yes}} || 26.67 % || 25.39 % || - ||rowspan=2| -
|-
| bg → mk || - || - ||
|-
|rowspan=2| <code>[[apertium-nn-nb]]</code> ||rowspan=2|12th&nbsp;October&nbsp;2009||rowspan=2| 0.6.1 || nn → nb ||rowspan=2 {{yes}} || - || - || - ||rowspan=2| Unhammer and Trosterud, 2009<br/> (two reference translations)
|-
| nb → nn ||32.5%, 17.7% || - || 0.74
|-
|rowspan=2| <code>[[apertium-br-fr]]</code> ||rowspan=2| March&nbsp;2010 ||rowspan=2| 0.2.0 || br → fr ||rowspan=2 {{no}} || 38 % || 22 % || - ||rowspan=2| Tyers, 2010
|-
| fr → br || - || - || -
|-
|rowspan=2| <code>[[apertium-sv-da]]</code> ||rowspan=2|12th&nbsp;October&nbsp;2009 ||rowspan=2| 0.5.0 || sv → da ||rowspan=2 {{yes}} || 30.3 % || 27.7 % || - ||rowspan=2| [http://wiki.apertium.org/w/index.php?title=Swedish_and_Danish/Evaluation&oldid=14881 Swedish_and_Danish/Evaluation]
|-
|-
| da → sv || - || - || -
| da → sv || - || - || -
|-
|-
|rowspan=2| <code>apertium-eu-es</code> ||rowspan=2|2nd September 2009 ||rowspan=2| || eu → es || 72.4 % || 39.8 % || - ||rowspan=2| Ginestí-Rosell et al., 2009
|rowspan=2| <code>[[apertium-eu-es]]</code> ||rowspan=2|2nd&nbsp;September&nbsp;2009 ||rowspan=2| || eu → es ||rowspan=2 {{unknown}} || 72.4 % || 39.8 % || - ||rowspan=2| Ginestí-Rosell et al., 2009
|-
|-
| es → eu || - || - || -
| es → eu || - || - || -
|-
|-
|rowspan=2| <code>apertium-cy-en</code> ||rowspan=2|2nd January 2009 ||rowspan=2| || cy → en || 55.7 % || 30.5 % || - ||rowspan=2| Tyers and Donnelly, 2009
|rowspan=2| <code>[[apertium-cy-en]]</code> ||rowspan=2|2nd&nbsp;January&nbsp;2009 ||rowspan=2| || cy → en ||rowspan=2 {{unknown}} || 55.7 % || 30.5 % || - ||rowspan=2| Tyers and Donnelly, 2009
|-
|-
| en → cy || - || - || -
| en → cy || - || - || -
|-
|-
|rowspan=2| <code>apertium-eo-en</code> ||rowspan=2|08 May 2009 ||rowspan=2| 0.9.0 || en → eo || 21.0 % || 19.0 % || - ||rowspan=2| http://wiki.apertium.org/w/index.php?title=English_and_Esperanto/Evaluation&oldid=12418
|rowspan=2| <code>[[apertium-eo-en]]</code> ||rowspan=2|8th&nbsp;May&nbsp;2009 ||rowspan=2| 0.9.0 || en → eo ||rowspan=2 {{unknown}} || 21.0 % || 19.0 % || - ||rowspan=2| [http://wiki.apertium.org/w/index.php?title=English_and_Esperanto/Evaluation&oldid=12418 English_and_Esperanto/Evaluation]
|-
|-
| eo → en || - || - || -
| eo → en || - || - || -
|-
|-
|rowspan=2| <code>apertium-es-pt</code> ||rowspan=2|15th May 2006 ||rowspan=2| || es → pt || 4.7 % || - || - ||rowspan=2| Armentano et al., 2006
|rowspan=2| <code>[[apertium-es-pt]]</code> ||rowspan=2|15th&nbsp;May&nbsp;2006 ||rowspan=2| || es → pt ||rowspan=2 {{unknown}} || 4.7 % || - || - ||rowspan=2| Armentano et al., 2006
|-
|-
| pt → es || 11.3 % || - || -
| pt → es || 11.3 % || - || -
|-
|-
|rowspan=2| <code>apertium-oc-ca</code> ||rowspan=2|10th May 2006 ||rowspan=2| || oc → ca || 9.6 % || - || - ||rowspan=2| Armentano and Forcada, 2006
|rowspan=2| <code>[[apertium-oc-ca]]</code> ||rowspan=2|10th&nbsp;May&nbsp;2006 ||rowspan=2| || oc → ca ||rowspan=2 {{unknown}} || 9.6 % || - || - ||rowspan=2| Armentano and Forcada, 2006
|-
|-
| ca → oc || - || - || -
| ca → oc || - || - || -
Line 34: Line 52:




|rowspan=2| <code>apertium-pt-ca</code> ||rowspan=2| 28th July 2008 ||rowspan=2| || pt → ca || 16.6% || - || - ||rowspan=2| Armentano and Forcada, 2008
|rowspan=2| <code>[[apertium-pt-ca]]</code> ||rowspan=2| 28th&nbsp;July&nbsp;2008 ||rowspan=2| || pt → ca ||rowspan=2 {{unknown}} || 16.6% || - || - ||rowspan=2| Armentano and Forcada, 2008
|-
|-
| ca → pt || 14.1% || - || -
| ca → pt || 14.1% || - || -
|-
|-
|rowspan=2| <code>apertium-en-es</code> ||rowspan=2| May 2009 ||rowspan=2| || en → es || - || - || 18.51%
|rowspan=2| <code>[[apertium-en-es]]</code> ||rowspan=2| May&nbsp;2009 ||rowspan=2| || en → es ||rowspan=2 {{unknown}} || - || - || 0.1851
|rowspan=2| Sánchez-Martínez, 2009
|rowspan=2| Sánchez-Martínez, 2009
|-
|-
| es → en || - || - || 18.81%
| es → en || - || - || 0.1881
|-
|-
|}
|}
Line 49: Line 67:
==References==
==References==


* Armentano-Oller, C., Carrasco, R. C., Corbí-Bellot, A. M., Forcada, M. L., Ginestí-Rosell, M., Ortiz-Rojas, S., Pérez-Ortiz, J. A., Ramírez-Sánchez, G., Sánchez-Martínez, F., Scalco, M. A. (2006) "Open-source Portuguese-Spanish machine translation", in ''Lecture Notes in Computer Science 3960 (Computational Processing of the Portuguese Language, Proceedings of the 7th International Workshop on Computational Processing of Written and Spoken Portuguese, PROPOR 2006), May 13-17, 2006, ME - RJ / Itatiaia, Rio de Janeiro, Brazil.'' , p. 50-59
* Armentano-Oller, C., Carrasco, R. C., Corbí-Bellot, A. M., Forcada, M. L., Ginestí-Rosell, M., Ortiz-Rojas, S., Pérez-Ortiz, J. A., Ramírez-Sánchez, G., Sánchez-Martínez, F., Scalco, M. A. (2006) "Open-source Portuguese-Spanish machine translation", in ''Lecture Notes in Computer Science 3960 (Computational Processing of the Portuguese Language, Proceedings of the 7th International Workshop on Computational Processing of Written and Spoken Portuguese, PROPOR 2006), May 13-17, 2006, ME - RJ / Itatiaia, Rio de Janeiro, Brazil.'' , p. 50-59
* Armentano-Oller, C. and Forcada, M. L. (2006) "Open-source machine translation between small languages: Catalan and Aranese Occitan", in ''Strategies for developing machine translation for minority languages (5th SALTMIL workshop on Minority Languages) (organized in conjunction with LREC 2006 (22-28.05.2006))'' , p. 51-54
* Armentano-Oller, C. and Forcada, M. L. (2006) "Open-source machine translation between small languages: Catalan and Aranese Occitan", in ''Strategies for developing machine translation for minority languages (5th SALTMIL workshop on Minority Languages) (organized in conjunction with LREC 2006 (22-28.05.2006))'' , p. 51-54
* C. Armentano-Oller, M.L. Forcada, "[http://www.dlsi.ua.es/~mlf/docum/armentanoooller08j.pdf Reutilización de datos lingüísticos para la creación de un sistema de traducción automática para un nuevo par de lenguas]", Procesamiento del Lenguaje Natural, :41, 243-250
* Armentano-Oller, C. and Forcada, M. L. (2008) "[http://www.dlsi.ua.es/~mlf/docum/armentanoooller08j.pdf Reutilización de datos lingüísticos para la creación de un sistema de traducción automática para un nuevo par de lenguas]". ''Procesamiento del Lenguaje Natural''. No. 41, pp. 243-250
* Ginestí-Rosell, M. and Ramírez-Sánchez, G. and Ortiz-Rojas, S. and Tyers, F. M. and Forcada, M. L. (2009) "[http://xixona.dlsi.ua.es/~fran/publications/sepln2009.pdf Development of a free Basque to Spanish machine translation system]". ''Procesamiento de Lenguaje Natural''. No. 43, pp. 185--197
* Ginestí-Rosell, M. and Ramírez-Sánchez, G. and Ortiz-Rojas, S. and Tyers, F. M. and Forcada, M. L. (2009) "[http://xixona.dlsi.ua.es/~fran/publications/sepln2009.pdf Development of a free Basque to Spanish machine translation system]". ''Procesamiento de Lenguaje Natural''. No. 43, pp. 185--197
* Tyers, F. M. and Donnelly, K. (2009) "[http://xixona.dlsi.ua.es/~fran/publications/mtm2009.pdf apertium-cy - a collaboratively-developed free RBMT system for Welsh to English]". The Prague Bulletin of Mathematical Linguistics No. 91, pp. 57--66.
* Tyers, F. M. and Donnelly, K. (2009) "[http://xixona.dlsi.ua.es/~fran/publications/mtm2009.pdf apertium-cy - a collaboratively-developed free RBMT system for Welsh to English]". The Prague Bulletin of Mathematical Linguistics No. 91, pp. 57--66.
* Felipe Sánchez-Martínez, Mikel L. Forcada, Andy Way. "[http://xixona.dlsi.ua.es/~fsanchez/publications/sanchez-martinez2009d.pdf Hybrid rule-based ‒ example-based MT: Feeding Apertium with sub-sentential translation units]". In Proceedings of the 3rd Workshop on Example-Based Machine Translation, p. 11-18, November 12-13, 2009, Dublin, Ireland.
* Sánchez-Martínez, Felipe; Forcada, Mikel L.; Way, Andy. "[http://www.dlsi.ua.es/~fsanchez/pub/pdf/sanchez-martinez09d.pdf Hybrid rule-based ‒ example-based MT: Feeding Apertium with sub-sentential translation units]". In Proceedings of the 3rd Workshop on Example-Based Machine Translation, p. 11-18, November 12-13, 2009, Dublin, Ireland.
* Unhammer, Kevin; Trosterud, Trond. "[http://rua.ua.es/dspace/handle/10045/12025 Reuse of free resources in machine translation between Nynorsk and Bokmål]". In: Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation / Edited by Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Francis M. Tyers. Alicante : Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos, 2009, pp. 35-42

[[Category:Evaluation]]
[[Category:Evaluation]]
[[Category:Documentation in English]]

Revision as of 11:14, 7 June 2012

This page aims to give an overview of the quality of various translators available in the Apertium platform. Word Error Rate (WER) and Position-independent Word Error Rate (PWER) are measures of post-edition effort. The number gives the expected number of words needed to be corrected in 100 words of running text. So, a WER of 4.7% indicates that in a given 100 words of text, 4.7 of them will need to be corrected by the post-editor.
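
As a rough illustration of how the two measures relate, the following minimal Python sketch computes WER as the word-level edit distance divided by the reference length, and PWER from a bag-of-words comparison that ignores word order. The function names (<code>edit_distance</code>, <code>wer</code>, <code>pwer</code>), the whitespace tokenisation and the exact PWER formula are assumptions made for this example; the evaluations listed on this page may have tokenised or normalised the text differently.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate in percent (whitespace tokenisation is an assumption)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def pwer(reference, hypothesis):
    """Position-independent WER in percent: word order is ignored."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Words shared by both sides, counted with multiplicity.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return 100.0 * (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"      # post-edited (corrected) text
    hypothesis = "the cat on the mat sat"     # raw translator output
    print(f"WER:  {wer(reference, hypothesis):.1f} %")
    print(f"PWER: {pwer(reference, hypothesis):.1f} %")
```

For this sentence pair the sketch reports a WER of 33.3 % and a PWER of 0.0 %: moving one word costs a deletion and an insertion under WER, but nothing under PWER. This is why PWER is never higher than WER for the same text, and why the gap between the two is larger for pairs where word order is a frequent source of errors.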

Precise numbers may vary due to differences in how the sentences to be evaluated are selected. In some pairs unknown words are taken into account, in others they are not. Evaluations where unknown words are allowed will likely give more accurate numbers for post-edition effort, provided the corpus on which the evaluation was made resembles the corpus on which further translations will be made. Evaluations not allowing unknown words will give a better indication of the "best-case" working of the transfer rules.

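{|class="wikitable"
! Translator !! Date !! Version !! Direction !! Unknown words !! WER !! PWER !! BLEU !! Reference / Notes
|-
|rowspan=2| apertium-eo-fr ||rowspan=2| 11th February 2011 ||rowspan=2| || fr → eo ||rowspan=2| Yes || 22.4 % || 20.6 % || - ||rowspan=2| French_and_Esperanto/Quality_tests
|-
| eo → fr || - || - || -
|-
|rowspan=2| apertium-mk-en ||rowspan=2| 19th September 2010 ||rowspan=2| 0.1.0 || mk → en ||rowspan=2| No || 43.96% || 31.22% || - ||rowspan=2| Percentage is average of 1,000 words from SETimes and 1,000 from Wikipedia
|-
| en → mk || - || - ||
|-
|rowspan=2| apertium-mk-bg ||rowspan=2| 31st August 2010 ||rowspan=2| 0.1.0 || mk → bg ||rowspan=2| Yes || 26.67 % || 25.39 % || - ||rowspan=2| -
|-
| bg → mk || - || - ||
|-
|rowspan=2| apertium-nn-nb ||rowspan=2| 12th October 2009 ||rowspan=2| 0.6.1 || nn → nb ||rowspan=2| Yes || - || - || - ||rowspan=2| Unhammer and Trosterud, 2009 (two reference translations)
|-
| nb → nn || 32.5%, 17.7% || - || 0.74
|-
|rowspan=2| apertium-br-fr ||rowspan=2| March 2010 ||rowspan=2| 0.2.0 || br → fr ||rowspan=2| No || 38 % || 22 % || - ||rowspan=2| Tyers, 2010
|-
| fr → br || - || - || -
|-
|rowspan=2| apertium-sv-da ||rowspan=2| 12th October 2009 ||rowspan=2| 0.5.0 || sv → da ||rowspan=2| Yes || 30.3 % || 27.7 % || - ||rowspan=2| Swedish_and_Danish/Evaluation
|-
| da → sv || - || - || -
|-
|rowspan=2| apertium-eu-es ||rowspan=2| 2nd September 2009 ||rowspan=2| || eu → es ||rowspan=2| Unknown || 72.4 % || 39.8 % || - ||rowspan=2| Ginestí-Rosell et al., 2009
|-
| es → eu || - || - || -
|-
|rowspan=2| apertium-cy-en ||rowspan=2| 2nd January 2009 ||rowspan=2| || cy → en ||rowspan=2| Unknown || 55.7 % || 30.5 % || - ||rowspan=2| Tyers and Donnelly, 2009
|-
| en → cy || - || - || -
|-
|rowspan=2| apertium-eo-en ||rowspan=2| 8th May 2009 ||rowspan=2| 0.9.0 || en → eo ||rowspan=2| Unknown || 21.0 % || 19.0 % || - ||rowspan=2| English_and_Esperanto/Evaluation
|-
| eo → en || - || - || -
|-
|rowspan=2| apertium-es-pt ||rowspan=2| 15th May 2006 ||rowspan=2| || es → pt ||rowspan=2| Unknown || 4.7 % || - || - ||rowspan=2| Armentano et al., 2006
|-
| pt → es || 11.3 % || - || -
|-
|rowspan=2| apertium-oc-ca ||rowspan=2| 10th May 2006 ||rowspan=2| || oc → ca ||rowspan=2| Unknown || 9.6 % || - || - ||rowspan=2| Armentano and Forcada, 2006
|-
| ca → oc || - || - || -
|-
|rowspan=2| apertium-pt-ca ||rowspan=2| 28th July 2008 ||rowspan=2| || pt → ca ||rowspan=2| Unknown || 16.6% || - || - ||rowspan=2| Armentano and Forcada, 2008
|-
| ca → pt || 14.1% || - || -
|-
|rowspan=2| apertium-en-es ||rowspan=2| May 2009 ||rowspan=2| || en → es ||rowspan=2| Unknown || - || - || 0.1851 ||rowspan=2| Sánchez-Martínez, 2009
|-
| es → en || - || - || 0.1881
|}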


References