Difference between revisions of "Uighur and Turkish/Work plan"

From Apertium
 
(24 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
{| class="wikitable"
 
{| class="wikitable"
 
!Week
 
!Week
  +
! Cov. goal
! Coverage
 
  +
! CG goal
  +
! Transfer goal
  +
! Lexsel goal
  +
! Corpusvoc goal
 
! Done?
 
! Done?
  +
! Coverage
  +
! Errors
  +
! Checkpoint
  +
! Comments
 
|-
 
|-
 
|April 23-29
 
|April 23-29
|30%
+
|45%
  +
| 5
 
|
 
|
  +
|
  +
| 200000
  +
|style="background-color: green"| '''✓'''
  +
| 45.5
  +
| 197742
  +
|
  +
| Good work!
 
|-
 
|-
 
|April 30 - May 6
 
|April 30 - May 6
|35%
+
|65%
|
+
| 10
  +
|
  +
|
  +
| 198000
  +
|style="background-color: yellow"| '''½'''
  +
| 65.6
  +
| 197742
  +
|
  +
| Good coverage, insufficient CG rules
 
|-
 
|-
 
|May 7-13
 
|May 7-13
|45%
+
|67%
 
|
 
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
|May 14-20
 
|May 14-20
|55%
+
|70%
 
|
 
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| May 21-27
 
| May 21-27
|65%
+
|75%
 
|
 
|
  +
|
  +
|
  +
|
  +
|style="background-color: green"| '''✓'''
  +
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| May 28-June 3
 
| May 28-June 3
|75%
+
|78%
  +
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
| 81.42
  +
|18334
  +
|
 
|
 
|
 
|-
 
|-
 
| June 4-10
 
| June 4-10
|80%
+
|82%
  +
| 20
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
 
|
 
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| June 11-17
 
| June 11-17
 
|84%
 
|84%
 
|
 
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
|
  +
|
  +
| Eval 1
  +
|
 
|-
 
|-
 
| June 18-24
 
| June 18-24
 
|85%
 
|85%
 
|
 
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| June 25 - July 1
 
| June 25 - July 1
 
|85%
 
|85%
  +
| 20
  +
|
  +
|
  +
|
  +
|style="background-color: green"| '''✓'''
 
|
 
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| July 2-8
 
| July 2-8
 
|86%
 
|86%
  +
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
|
  +
|
  +
|
 
|
 
|
 
|-
 
|-
Line 51: Line 147:
 
|88%
 
|88%
 
|
 
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
|
  +
|
  +
| Eval 2
  +
|
 
|-
 
|-
 
| July 16-22
 
| July 16-22
|88%
+
|89%
 
|
 
|
  +
|
  +
|
  +
|
  +
| style="background-color: green"| '''✓'''
  +
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| July 23-29
 
| July 23-29
|88%
+
|90%
 
|
 
|
  +
|
  +
|
  +
|
  +
|
  +
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| July 30 - August 5
 
| July 30 - August 5
|90%
+
|91%
 
|
 
|
  +
|
  +
|
  +
|
  +
|
  +
|
  +
|
  +
|
  +
|
 
|-
 
|-
 
| August 6-14
 
| August 6-14
 
|92%
 
|92%
 
|
 
|
  +
|
  +
|
  +
| 0
  +
|
  +
|
  +
|
  +
| Final Evals
  +
|
 
|}
 
|}
   
 
==Plan by Weeks==
 
==Plan by Weeks==
   
1. 30% coverage
+
# 45% coverage, adding new stems to bidix and monodix
  +
# 65% coverage, adding new stems to bidix and monodix
 
2. Basic CG
+
# 67% coverage, Basic CG
  +
# 70% coverage, Adding inflectional affixes to uig.lexc, writing twol rules for them
 
  +
# 75% coverage, Adding derivational affixes to uig.lexc, writing twol rules for them
3. 40% coverage
 
  +
# 78% coverage, Transfer, CG
 
  +
# 82% coverage, CG, lexsel
4. Transfer
 
  +
# 84% coverage, Transfer, lexsel
 
5. 50% coverage
+
# 85% coverage, Transfer, lexsel
  +
# 85% coverage, CG, Transfer
 
  +
# 86% coverage, Transfer, lexsel
6. Transfer, lexical selection, 65% coverage
 
  +
# 88% coverage, Transfer, CG
 
  +
# Preparing text for annotation, evaluation
7. CG, 80% coverage
 
  +
# Annotating the Uyghur corpus, %90 coverage
 
8. Transfer, lexsel, 84% coverage
+
# Annotating the Uyghur corpus, %90 coverage, Writing paper
  +
# Writing paper
 
9. Transfer
 
 
10. CG, Transfer
 
 
11. Transfer, lexsel, 86% coverage
 
 
12. Transfer, 88% coverage
 
 
13. Preparing text for annotation
 
 
14-16. Annotating the Uyghur corpus, %90 coverage
 
   
 
== Plan Outline ==
 
== Plan Outline ==

Latest revision as of 08:30, 23 July 2018

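{| class="wikitable"
! Week !! Cov. goal !! CG goal !! Transfer goal !! Lexsel goal !! Corpusvoc goal !! Done? !! Coverage !! Errors !! Checkpoint !! Comments
|-
| April 23-29 || 45% || 5 || || || 200000 || style="background-color: green" | '''✓''' || 45.5 || 197742 || || Good work!
|-
| April 30 - May 6 || 65% || 10 || || || 198000 || style="background-color: yellow" | '''½''' || 65.6 || 197742 || || Good coverage, insufficient CG rules
|-
| May 7-13 || 67% || || || || || style="background-color: green" | '''✓''' || || || ||
|-
| May 14-20 || 70% || || || || || style="background-color: green" | '''✓''' || || || ||
|-
| May 21-27 || 75% || || || || || style="background-color: green" | '''✓''' || || || ||
|-
| May 28-June 3 || 78% || || || || || style="background-color: green" | '''✓''' || 81.42 || 18334 || ||
|-
| June 4-10 || 82% || 20 || || || || style="background-color: green" | '''✓''' || || || ||
|-
| June 11-17 || 84% || || || || || style="background-color: green" | '''✓''' || || || Eval 1 ||
|-
| June 18-24 || 85% || || || || || style="background-color: green" | '''✓''' || || || ||
|-
| June 25 - July 1 || 85% || 20 || || || || style="background-color: green" | '''✓''' || || || ||
|-
| July 2-8 || 86% || || || || || style="background-color: green" | '''✓''' || || || ||
|-
| July 9-15 || 88% || || || || || style="background-color: green" | '''✓''' || || || Eval 2 ||
|-
| July 16-22 || 89% || || || || || style="background-color: green" | '''✓''' || || || ||
|-
| July 23-29 || 90% || || || || || || || || ||
|-
| July 30 - August 5 || 91% || || || || || || || || ||
|-
| August 6-14 || 92% || || || || 0 || || || || Final Evals ||
|}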
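
Three weeks in the table report both a coverage goal and a measured coverage figure. As a rough sketch, assuming the "Coverage" column holds the measured percentage for that week (the names `ROWS`, `margin` and `report` are made up for this illustration), the margin over each goal can be computed like this:

```python
# (week, coverage goal in %, measured coverage in %), copied from the
# three table rows that report a measured value.
ROWS = [
    ("April 23-29", 45.0, 45.5),
    ("April 30 - May 6", 65.0, 65.6),
    ("May 28-June 3", 78.0, 81.42),
]


def margin(goal, measured):
    """Return measured minus goal, in percentage points, rounded to 2 places."""
    return round(measured - goal, 2)


def report(rows):
    """Return (week, margin, goal_met) for every row."""
    return [(week, margin(goal, measured), measured >= goal)
            for week, goal, measured in rows]


if __name__ == "__main__":
    for week, diff, met in report(ROWS):
        status = "met" if met else "missed"
        print(f"{week}: {diff:+.2f} points, goal {status}")
```

Under that reading of the columns, all three weeks with a reported figure reach their coverage goal.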

Plan by Weeks

  1. 45% coverage, adding new stems to bidix and monodix
  2. 65% coverage, adding new stems to bidix and monodix
  3. 67% coverage, basic CG
  4. 70% coverage, adding inflectional affixes to uig.lexc, writing twol rules for them
  5. 75% coverage, adding derivational affixes to uig.lexc, writing twol rules for them
  6. 78% coverage, transfer, CG
  7. 82% coverage, CG, lexsel
  8. 84% coverage, transfer, lexsel
  9. 85% coverage, transfer, lexsel
  10. 85% coverage, CG, transfer
  11. 86% coverage, transfer, lexsel
  12. 88% coverage, transfer, CG
  13. Preparing text for annotation, evaluation
  14. Annotating the Uyghur corpus, 90% coverage
  15. Annotating the Uyghur corpus, 90% coverage, writing paper
  16. Writing paper
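
The weekly coverage targets above are usually read as naive coverage: the share of tokens in a corpus for which the analyser returns at least one analysis. The sketch below is one possible way to compute that figure from text in Apertium stream format, where each lexical unit looks like `^surface/analysis$` and an unknown word is marked with a leading `*` in its analysis. The function name `naive_coverage` and the sample string are illustrative assumptions, not part of the project's scripts, and escaped characters inside lexical units are ignored for simplicity.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (simplification: escaped '^', '$' and '/' characters are not handled).
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def naive_coverage(analysed_text):
    """Return the percentage of lexical units that have a known analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        parts = match.group(1).split("/")
        total += 1
        # An unknown word has a single analysis that starts with '*'.
        if len(parts) > 1 and not parts[1].startswith("*"):
            known += 1
    return 100.0 * known / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample: one analysed token and one unknown token.
    sample = "^kitab/kitab<n><nom>$ ^xyz/*xyz$"
    print(naive_coverage(sample))
```

The result can then be compared against the goal for the current week.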

Plan Outline

  • Post-application period:
    • Facilitating MT of a text from Uyghur to Turkish.
  • Community-bonding period:
    • Adding words to bidix, up to 50% coverage
  • Month 1:
    • Writing scripts
    • Adding words to bidix to bring coverage to around 80%
    • Chunking
    • Transfer rules
    • Begin CG for UIG
  • Month 2:
    • POS tagging/constraint grammar
    • Transfer rules
    • Get CG rules up to 100, ~50% disambiguation
    • >90% coverage
  • Month 3:
    • Creation of an Annotated Corpus