Difference between revisions of "Google Summer of Code/Midterm report 2011"

From Apertium

Revision as of 11:27, 12 July 2011

Language pairs

For language pairs, we have two tasks. For some pairs, the task was to translate a news article without any diagnostics and evaluate the output. For the other pairs, the task was to create morphological analysers with 80% coverage.
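The report does not say how coverage is measured. A minimal sketch, assuming the naive definition (the share of corpus tokens that receive at least one analysis), could look like the following. The function name, the toy lexicon and the token list are all hypothetical and stand in for a real analyser and corpus.

```python
def coverage(tokens, is_known):
    """Return the percentage of tokens for which is_known(token) is true.

    Assumption: "coverage" means naive token coverage, i.e. known tokens
    divided by all tokens. The report itself does not define the term.
    """
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if is_known(token))
    return 100.0 * known / len(tokens)


# Hypothetical stand-in for a morphological analyser: a set of known forms.
toy_lexicon = {"bu", "bir", "ev"}
toy_tokens = ["bu", "bir", "ev", "xyz", "bir"]

# Four of the five tokens are in the toy lexicon.
toy_coverage = coverage(toy_tokens, toy_lexicon.__contains__)
print(f"{toy_coverage:.2f} %")
```

With a real analyser, `is_known` would be replaced by a check on whether the analyser returned any analysis for the token.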

Turkish → Azerbaijani

Statistics about input files
-------------------------------------------------------
Number of words in reference: 356
Number of words in test: 364
Number of unknown words (marked with a star) in test: 4
Percentage of unknown words: 1.10 %

Results when removing unknown-word marks (stars)
-------------------------------------------------------
Edit distance: 52
Word error rate (WER): 14.29 %
Number of position-independent word errors: 50
Position-independent word error rate (PER): 13.74 %

Statistics about the translation of unknown words
-------------------------------------------------------
Number of unknown words which were free rides: 1
Percentage of unknown words that were free rides: 25.00 %
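The percentages above can be rechecked from the raw counts. The reported figures are consistent with dividing by the number of words in the test file (364) rather than the reference (356): 52 / 364 gives 14.29 %, whereas 52 / 356 would give 14.61 %. The sketch below reproduces that arithmetic and also shows a plain word-level edit distance of the kind WER is based on. It is an illustration under these assumptions, not the evaluation script that produced the report, and all names in it are hypothetical.

```python
def word_edit_distance(reference, test):
    """Levenshtein distance between two sentences, counted in words."""
    ref, hyp = reference.split(), test.split()
    # prev[j] holds the distance between the words of ref seen so far
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def percentage(count, total):
    """Return count / total as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Counts copied from the Turkish → Azerbaijani report.
TEST_WORDS = 364      # number of words in test
EDIT_DISTANCE = 52
PI_WORD_ERRORS = 50   # position-independent word errors
UNKNOWN_WORDS = 4
FREE_RIDES = 1

# Assumption: the denominator is the test word count, inferred from the figures.
wer = percentage(EDIT_DISTANCE, TEST_WORDS)
per = percentage(PI_WORD_ERRORS, TEST_WORDS)
unknown_rate = percentage(UNKNOWN_WORDS, TEST_WORDS)
free_ride_rate = percentage(FREE_RIDES, UNKNOWN_WORDS)

print(wer, per, unknown_rate, free_ride_rate)
```

The toy call `word_edit_distance("a b c", "a x c d")` counts one substitution and one insertion.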


Turkish → Kyrgyz

Serbo-Croatian → Macedonian

Slovenian → Spanish

Maltese → Hebrew

Bengali → English