Difference between revisions of "Uighur and Turkish/Work plan"
Revision as of 07:51, 9 May 2018
Week | Cov. goal | CG goal | Transfer goal | Lexsel goal | Corpusvoc goal | Done? | Coverage | Errors | Checkpoint | Comments |
---|---|---|---|---|---|---|---|---|---|---|
April 23-29 | 45% | 5 | 200000 | ✓ | 45.5 | 197742 | Good work! | |||
April 30 - May 6 | 65% | 10 | 198000 | ½ | 65.6 | 197742 | Good coverage, insufficient CG rules | |||
May 7-13 | 67% | |||||||||
May 14-20 | 70% | |||||||||
May 21-27 | 75% | |||||||||
May 28-June 3 | 78% | |||||||||
June 4-10 | 82% | 20 | ||||||||
June 11-17 | 84% | Eval 1 | ||||||||
June 18-24 | 85% | |||||||||
June 25 - July 1 | 85% | 20 | ||||||||
July 2-8 | 86% | |||||||||
July 9-15 | 88% | Eval 2 | ||||||||
July 16-22 | 89% | |||||||||
July 23-29 | 90% | |||||||||
July 30 - August 5 | 91% | |||||||||
August 6-14 | 92% | 0 | Final Evals |
Plan by Weeks
- 45% coverage, adding new stems to bidix and monodix
- 65% coverage, adding new stems to bidix and monodix
- 67% coverage, Basic CG
- 70% coverage, Adding inflectional affixes to uig.lexc, writing twol rules for them
- 75% coverage, Adding derivational affixes to uig.lexc, writing twol rules for them
- 78% coverage, Transfer, CG
- 82% coverage, CG, lexsel
- 84% coverage, Transfer, lexsel
- 85% coverage, Transfer, lexsel
- 85% coverage, CG, Transfer
- 86% coverage, Transfer, lexsel
- 88% coverage, Transfer, CG
- Preparing text for annotation, evaluation
- Annotating the Uyghur corpus, 90% coverage
- Annotating the Uyghur corpus, 90% coverage, writing paper
- Writing paper
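
The coverage figures in the weekly goals can be checked against analyser output. The following is a minimal sketch, assuming "coverage" means naive coverage: the share of tokens that receive at least one analysis. It relies on the Apertium stream format, in which each lexical unit is written as `^surface/analysis1/analysis2$` and an unknown word is marked with a leading `*`. The function name `naive_coverage`, the sample words and their tags are illustrative and not taken from the actual uig dictionaries, and escaped characters in the stream are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (simplification: escaped characters such as \/ or \$ are not handled)
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")


def naive_coverage(analysed_text):
    """Return the fraction of lexical units that received at least one analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        first_analysis = match.group(2).split("/")[0]
        # The analyser marks an unknown word by prefixing it with '*'
        if not first_analysis.startswith("*"):
            known += 1
    return known / total if total else 0.0


# Hypothetical analyser output; the tags are illustrative only
sample = (
    "^kitab/kitab<n><nom>$ "
    "^oqudum/oqu<v><tv><past><p1><sg>$ "
    "^xyzzy/*xyzzy$ "
    "^./.<sent>$"
)

print(f"{naive_coverage(sample):.1%}")
```

In this sample three of the four lexical units are analysed. On a real corpus the input would be the output of the morphological analyser run over the whole text.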
Plan Outline
- Post-application period:
  - Facilitating MT of a text from Uyghur to Turkish
- Community-bonding period:
  - Adding words to bidix, up to 50% coverage
- Month 1:
  - Writing scripts
  - Adding words to bidix, bringing coverage to around 80%
  - Chunking
  - Transfer rules
  - Beginning CG for UIG
- Month 2:
  - POS tagging/constraint grammar
  - Transfer rules
  - Bringing the number of CG rules up to 100, ~50% disambiguation
  - >90% coverage
- Month 3:
  - Creation of an annotated corpus
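
The "~50% disambiguation" target for Month 2 is not defined on this page. One possible reading is that the CG rules should remove about half of the surplus analyses, that is, analyses beyond one per token. The sketch below measures this under that assumption by comparing analyser output before and after constraint grammar, both in Apertium stream format. The function names and the sample data are hypothetical, and escaped characters in the stream are not handled.

```python
import re

# Captures the analyses part of one lexical unit: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^[^/$^]*/([^$]*)\$")


def mean_ambiguity(stream):
    """Average number of analyses per known lexical unit."""
    counts = [
        len(match.group(1).split("/"))
        for match in LEXICAL_UNIT.finditer(stream)
        if not match.group(1).startswith("*")  # skip unknown words
    ]
    return sum(counts) / len(counts) if counts else 0.0


def ambiguity_reduction(before, after):
    """Share of surplus analyses (beyond one per token) removed by disambiguation."""
    surplus_before = mean_ambiguity(before) - 1.0
    surplus_after = mean_ambiguity(after) - 1.0
    if surplus_before <= 0:
        return 0.0
    return 1.0 - surplus_after / surplus_before


# Hypothetical streams before and after CG; the tags are illustrative only
before_cg = "^at/at<n><nom>/at<v><tv><imp>/at<n><attr>$ ^keldi/kel<v><iv><past>$"
after_cg = "^at/at<n><nom>/at<n><attr>$ ^keldi/kel<v><iv><past>$"

print(f"{ambiguity_reduction(before_cg, after_cg):.0%}")
```

Here the mean ambiguity drops from 2.0 to 1.5 analyses per token, so half of the surplus analyses are removed.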