Difference between revisions of "Google Summer of Code/Midterm report 2011"
		
		
		
		
		
		
Revision as of 11:27, 12 July 2011
Language pairs
Each language pair had one of two tasks. For some pairs, the task was to translate a news article without any diagnostics and evaluate the output. For the other pairs, the task was to create morphological analysers with 80% coverage.
Turkish → Azerbaijani
Statistics about input files
-------------------------------------------------------
Number of words in reference: 356
Number of words in test: 364
Number of unknown words (marked with a star) in test: 4
Percentage of unknown words: 1.10 %

Results when removing unknown-word marks (stars)
-------------------------------------------------------
Edit distance: 52
Word error rate (WER): 14.29 %
Number of position-independent word errors: 50
Position-independent word error rate (PER): 13.74 %

Statistics about the translation of unknown words
-------------------------------------------------------
Number of unknown words which were free rides: 1
Percentage of unknown words that were free rides: 25.00 %
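For readers who want to check the figures, the following is a minimal sketch of how the percentages above relate to the raw counts. It assumes that WER and PER are normalised by the number of words in the test text (364), which is consistent with the reported values (52/364 and 50/364), and that the edit distance is a word-level Levenshtein distance. The function names `edit_distance` and `rate` are hypothetical and are not taken from the evaluation script that produced the report.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists.

    Assumption: unit cost for insertion, deletion and substitution.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rate(count, total):
    """Express count as a percentage of total."""
    return 100.0 * count / total


# Counts copied from the Turkish → Azerbaijani report.
test_words = 364
unknown_words = 4

print(f"Unknown words: {rate(unknown_words, test_words):.2f} %")
print(f"WER:           {rate(52, test_words):.2f} %")
print(f"PER:           {rate(50, test_words):.2f} %")
print(f"Free rides:    {rate(1, unknown_words):.2f} %")
```

To compute an edit distance for a new translation, both texts would be tokenised the same way first, for example `edit_distance(reference.split(), test.split())`.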

