Difference between revisions of "Курсы машинного перевода для языков России/Session 7"

From Apertium
Revision as of 09:31, 9 January 2012

Theory

Consistency

Quality

Evaluation

Vocabulary coverage

The coverage of a system is an indication of how much of the vocabulary it covers in a given corpus or domain. For an idea of what this means, we will try translating a sentence with different levels of coverage:

Source sentence: Селскостопанските отрасли в Косово и Македония ще получат тласък.

Coverage  Sentence
11%       Селскостопанските отрасли en Косово и Македония ще получат тласък.
22%       Селскостопанските отрасли en Косово y Македония ще получат тласък.
33%       Селскостопанските отрасли en Косово y Македония получат тласък.
44%       Селскостопанските отрасли en Косово y Македония recibirá тласък.
55%       El agrícola отрасли en Косово y Македония recibirá тласък.
66%       El agrícola отрасли en Косово y Македония recibirá empujón.
77%       El sector agrícola en Косово y Македония recibirá empujón.
88%       El sector agrícola en Косово y Macedonia recibirá empujón.
100%      El sector agrícola en Kosovo y Macedonia recibirá empujón.

Usually, coverage is given over a set of sentences, or corpus, instead of over a single sentence. In Apertium, the baseline coverage for releasing a new prototype translator is around 80%, or 2 unknown words in 10 for a given corpus. This is not enough to make revision practical, except in the case of closely-related languages.
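
The coverage figure can be sketched in a few lines of Python. This is only an illustration, not the tool Apertium uses: the `tokenize` and `coverage` helpers and the three-word `known_words` vocabulary are assumptions made for the example, and punctuation handling is deliberately naive.

```python
def tokenize(text):
    """Split on whitespace and strip surrounding punctuation (naive)."""
    return [w.strip(".,;:!?") for w in text.split() if w.strip(".,;:!?")]


def coverage(tokens, known_words):
    """Return the fraction of tokens that appear in known_words."""
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t.lower() in known_words)
    return known / len(tokens)


sentence = "Селскостопанските отрасли в Косово и Македония ще получат тласък."
tokens = tokenize(sentence)

# Hypothetical vocabulary: only three of the nine words are known.
known_words = {"в", "и", "ще"}

print(f"{coverage(tokens, known_words):.0%}")  # prints "33%"
```

Over a corpus, the same ratio would be computed on all tokens together rather than averaged sentence by sentence, so that long sentences count in proportion to their length.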

Error rate

Coverage gives you an idea of how many words you will have to change in the best case, that is, assuming that the rest of the translation is correct. A more accurate indication of how many words you will have to change when using the translator is given by the post-edition word error rate (often abbreviated as WER). This is the number of changes (insertions, deletions, substitutions) needed to turn a machine-translated sentence into a sentence which has been revised by a human translator, given as a percentage of the sentence length.

Taking the example above:

Step                 Changes  WER     Sentence
Original                              Селскостопанските отрасли в Косово и Македония ще получат тласък.
Machine translation                   El sector agrícola en Kosovo y Macedonia recibirá empujón.
  substitute         2/9              El sector agricultura en Kosovo y Macedonia recibirá impulso.
  insert             3/9              El sector de la agricultura en Kosovo y Macedonia recibirá un impulso.
Revised              5/9      55.56%  El sector de la agricultura en Kosovo y Macedonia recibirá un impulso.
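
The count of changes in this example can be reproduced with a word-level edit distance. The sketch below is a minimal illustration and not the Apertium evaluation tool itself; the helper names `tokenize` and `word_edit_distance` are assumptions introduced here. Note that the example above divides by the 9 words of the machine translation, while dividing by the length of the revised (reference) sentence is also a common convention, so both are shown.

```python
def tokenize(text):
    """Split on whitespace and strip surrounding punctuation (naive)."""
    return [w.strip(".,;:!?") for w in text.split() if w.strip(".,;:!?")]


def word_edit_distance(hyp, ref):
    """Levenshtein distance over word lists (unit cost for every edit)."""
    prev = list(range(len(ref) + 1))  # distance from empty hyp to ref[:j]
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to empty ref
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


mt = "El sector agrícola en Kosovo y Macedonia recibirá empujón."
revised = "El sector de la agricultura en Kosovo y Macedonia recibirá un impulso."

hyp, ref = tokenize(mt), tokenize(revised)
edits = word_edit_distance(hyp, ref)

wer_mt_length = edits / len(hyp)   # as in the example: 5/9
wer_ref_length = edits / len(ref)  # alternative convention: 5/12

print(edits, f"{wer_mt_length:.2%}", f"{wer_ref_length:.2%}")
# prints: 5 55.56% 41.67%
```

Whichever denominator is chosen, it should be kept fixed when comparing systems or revisions, since the two conventions give different percentages for the same set of edits.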

As with coverage, error rate evaluation is usually carried out on a corpus of sentences, so that it gives you an indication of how many words you are likely to have to change in a given sentence.

When calculated over an appropriate corpus of the target translation domain, the combination of word error rate and coverage can give an idea of the usefulness of a machine translation system for a specific task. Of course, to determine if a system is useful for translators, a more thorough and case-specific evaluation needs to be made.


Practice

Word error rate

Apertium has a tool for calculating the word error rate between a reference translation and a machine translation. The objective of this practical is to try it out on the system you have created.

You will need two reference translations. The first will be the "original" text in the target language, which was created without post-editing. The second will be a post-edited version of the machine translation text. When you are creating the post-edited version, take care to make only the minimal changes required to produce an adequate translation.