Turkic MT Improvements GSoC2019 report
Revision as of 11:12, 25 August 2019
The aim of this project was to improve the following Apertium language pairs: tur->uig, uzb->tur, kir->tur, tat->tur.
Commits
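My commits can be found below, listed by repository:
Tur-Uzb: https://github.com/apertium/apertium-tur-uzb/commits?author=koguzhan
Tur: https://github.com/apertium/apertium-tur/commits?author=koguzhan
Uzb: https://github.com/apertium/apertium-uzb/commits?author=koguzhan
Uig-Tur: https://github.com/apertium/apertium-uig-tur/commits?author=koguzhan
Uig: https://github.com/apertium/apertium-uig/commits?author=koguzhan
Tur-Tat: https://github.com/apertium/apertium-tur-tat/commits?author=koguzhan
Tat: https://github.com/apertium/apertium-tat/commits?author=koguzhan
Tur-Kir: https://github.com/apertium/apertium-tur-kir/commits?author=koguzhan
Kir: https://github.com/apertium/apertium-kir/commits?author=koguzhan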
Transfer
Transfer rules were written for tur->uig and uzb->tur, using Regression Tests. They can be found here: Uighur and Uzbek.
Corpora and Coverage
Language pair | Wiki | Bible |
---|---|---|
Tur-Uig | 53505239 words, 82.3% cov | 178233 words, 93.0% cov |
Uzb-Tur | 12730161 words, 80.8% cov | 184447 words, 81.1% cov |
Kir-Tur | 11435418 words, 82.5% cov | 184808 words, 92.0% cov |
Tat-Tur | 5792382 words, 86.4% cov | 178220 words, 91.4% cov |
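For readers unfamiliar with the coverage figures, the following is a minimal sketch of how such a number can be computed from morphological analyser output. It assumes the Apertium stream format, in which each token appears as ^surface/analysis$ and an unknown word has an analysis beginning with an asterisk. The function name coverage, the sample string and its tags are illustrative assumptions, not the script used for this report, and the report's own counting method (for example, whether punctuation is counted) may differ.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Simplification (assumption): escaped characters such as \/ or \$ are not handled.
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text: str) -> float:
    """Return the fraction of lexical units that received at least one analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    # The analyser marks an unknown word by prefixing its analysis with '*'.
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return known / len(units)


if __name__ == "__main__":
    # Illustrative input: two known units and one unknown word.
    sample = "^ev/ev<n><nom>$ ^xyzzy/*xyzzy$ ^./.<sent>$"
    print(f"{coverage(sample):.1%}")
```

In this sketch punctuation counts as a token, so the sample yields two known units out of three.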
Disambiguation
To select the correct lemma and morphological analysis for each word, so that it can be translated correctly into the target language, Apertium uses Constraint Grammar (CG).
Lexical Selection
Lexical selection determines which translation of a given lemma is chosen in which context.