User:Ozgay
Latest revision as of 21:05, 8 April 2019
GSoC 2018 proposal draft to create and develop a Turkmen-Turkish translation pair.
Personal Information
Name: Özge Kılıç
E-mail: ozgekilic.9@gmail.com
IRC: ozgay
Time zone: UTC+3
Why is it that you are interested in Apertium?
I'm a student of English Translation & Interpreting. I intend to get a master's degree in Linguistics, and I think this is an excellent start.
Proposal: Turkmen-Turkish MT
Which of the published tasks are you interested in? What do you plan to do?
My plan is to adopt an unreleased language pair, tuk-tur. I'll be working on it to bring it up to release quality, which will involve writing and refining rules for transfer and lexical selection that will result in a valid text in the target language.
Why should Google and Apertium sponsor it?
Since Turkmen-Turkish resources are limited, this machine translation pair is a great opportunity for the two nations to understand each other.
Resources
Wikipedia
Türkmence Sözlük
Work Plan
Plan by Weeks
1. 50% coverage
2. Basic CG
3. 60% coverage
4. Transfer
5. 70% coverage
6. Transfer, lexical selection, 80% coverage
7. CG, 83% coverage
8. Transfer, lexsel, 86% coverage
9. Transfer
10. CG, Transfer
11. Transfer, lexsel, 89% coverage
12. Transfer, 92% coverage
13. Preparing text for annotation
14-16. Annotating the Turkmen corpus, 95% coverage
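The coverage targets in the plan can be tracked with a small script. The following is a minimal sketch, assuming that coverage means naive coverage: the share of tokens for which the morphological analyser returns at least one analysis. It reads analyser output in Apertium stream format, where an unknown word is marked with `*` (for example `^word/*word$`). The function name `naive_coverage` and the sample sentence with its tags are made up for illustration and are not taken from the tuk-tur pair; escaped characters in the stream format are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def naive_coverage(analysed_text: str) -> float:
    """Return the fraction of lexical units that have a known analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    # The analyser marks an unknown word by prefixing its analysis with '*'.
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return known / len(units)


# Hypothetical analyser output: two known words and one unknown word.
sample = (
    "^Men/men<prn><pers><p1><sg><nom>$ "
    "^kitap/kitap<n><nom>$ "
    "^okaýaryn/*okaýaryn$"
)
print(f"{naive_coverage(sample):.0%}")
```

Running the same script on a fixed corpus each week would make the percentages in the plan comparable from one week to the next.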
Coding Challenge
I'm facilitating the translation of a Turkmen text of about 500 words into Turkish.
Deliverables
WER comparable to other inter-Turkic/Romance pairs. Data for machine-learned disambiguation.
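WER (word error rate) is usually computed as the word-level edit distance between the system output and a reference translation, divided by the number of words in the reference. The following is a minimal sketch of that calculation, assuming whitespace tokenisation; the function name `word_error_rate` and the example sentences are illustrative and are not part of the Apertium tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Made-up example: one substituted word out of three reference words.
print(word_error_rate("ben kitap okuyorum", "ben kitabı okuyorum"))
```

A lower value is better, and the value can exceed 1.0 when the output contains many extra words.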
Summer Obligations and Commitments
I'll be busy with my finals in the first week of June, but I'll be free at other times.
Qualification
I'm a 3rd-year student of English & Turkish Translation & Interpreting at Marmara University. I'm a native speaker of Turkish. I have taken Russian classes.