Difference between revisions of "Apertium-kaz-kir/Workplan"

Line 23: Line 23:
 
!style="width: 35%"| notes
 
!style="width: 35%"| notes
 
|-
 
|-
|colspan="2" align="right"|post-application period<br />3 - 24 May
+
!colspan="2" style="text-align: right"|post-application period<br />3 - 24 May
 
|
 
|
 
# finish coding challenge with WER ~10%
 
# finish coding challenge with WER ~10%
Line 39: Line 39:
 
: —[[User:Firespeaker|Firespeaker]] 06:45, 20 May 2013 (UTC)
 
: —[[User:Firespeaker|Firespeaker]] 06:45, 20 May 2013 (UTC)
 
|-
 
|-
|colspan="2" align="right"|community bonding period<br />27 May - 16 June
+
!colspan="2" style="text-align: right"|community bonding period<br />27 May - 16 June
 
|
 
|
 
# run first testvoc
 
# run first testvoc
Line 51: Line 51:
 
|
 
|
 
|-
 
|-
  +
! 1
| 1 ||align="right"| 17 - 22 June
+
!style="text-align: right"| 17 - 22 June
 
|
 
|
 
# total 1500 stems in dix
 
# total 1500 stems in dix
Line 61: Line 62:
 
|
 
|
 
|-
 
|-
  +
! 2
| 2 ||align="right"| 23 - 29 June
+
!style="text-align: right"| 23 - 29 June
 
|
 
|
 
# total 2400 stems in dix
 
# total 2400 stems in dix
Line 70: Line 72:
 
|
 
|
 
|-
 
|-
  +
! 3
| 3 ||align="right"| 30 - 6 July
+
!style="text-align: right"| 30 - 6 July
 
|
 
|
 
# total 3200 stems in dix
 
# total 3200 stems in dix
Line 79: Line 82:
 
|
 
|
 
|-
 
|-
  +
! 4
| 4 ||align="right"| 7 - 13 July
+
!style="text-align: right"| 7 - 13 July
 
|
 
|
 
# total 4000 stems in dix
 
# total 4000 stems in dix
Line 88: Line 92:
 
|
 
|
 
|-
 
|-
  +
! 5
| 5 ||align="right"| 14 -20 July
+
!style="text-align: right"| 14 -20 July
 
|
 
|
 
# total 4800 stems in dix
 
# total 4800 stems in dix
Line 97: Line 102:
 
|
 
|
 
|-
 
|-
  +
! 6
| 6 ||align="right"| 21 - 27 July
+
!style="text-align: right"| 21 - 27 July
 
|
 
|
 
# total 5600 stems in dix
 
# total 5600 stems in dix
Line 106: Line 112:
 
|
 
|
 
|-
 
|-
  +
! 7
| 7 ||align="right"| 28 - 3 August
+
!style="text-align: right"| 28 - 3 August
 
|
 
|
 
# total 6400 stems in dix
 
# total 6400 stems in dix
Line 114: Line 121:
 
|
 
|
 
|-
 
|-
|colspan="2" align="right"| '''midterm eval<br />2 August'''
+
!colspan="2" style="text-align: right"| midterm eval<br />2 August
 
|
 
|
 
# total 6500 stems in dix
 
# total 6500 stems in dix
Line 123: Line 130:
 
|
 
|
 
|-
 
|-
  +
! 8
| 8 ||align="right"| 4 - 10 August
+
!style="text-align: right"| 4 - 10 August
 
|
 
|
 
# total 7200 stems in dix
 
# total 7200 stems in dix
Line 132: Line 140:
 
|
 
|
 
|-
 
|-
  +
! 9
| 9 ||align="right"| 11 - 17 August
+
!style="text-align: right"| 11 - 17 August
 
|
 
|
 
# total 8000 stems in dix
 
# total 8000 stems in dix
Line 140: Line 149:
 
|
 
|
 
|-
 
|-
  +
! 10
| 10 ||align="right"| 18 - 24 August
+
!style="text-align: right"| 18 - 24 August
 
|
 
|
 
# total 8800 stems in dix
 
# total 8800 stems in dix
Line 148: Line 158:
 
|
 
|
 
|-
 
|-
  +
! 11
| 11 ||align="right"| 25 - 31 August
+
!style="text-align: right"| 25 - 31 August
 
|
 
|
 
# total 9600 stems in dix
 
# total 9600 stems in dix
Line 157: Line 168:
 
|
 
|
 
|-
 
|-
  +
! 12
| 12 ||align="right"| 1 - 7 September
+
!style="text-align: right"| 1 - 7 September
 
|
 
|
 
# total 10400 stems in dix
 
# total 10400 stems in dix
Line 165: Line 177:
 
|
 
|
 
|-
 
|-
  +
! 13
| 13 ||align="right"| 8 - 15 September
+
!style="text-align: right"| 8 - 15 September
 
|
 
|
 
# total 11200 stems in dix
 
# total 11200 stems in dix
Line 173: Line 186:
 
|
 
|
 
|-
 
|-
|colspan="2" align="right"| '''pencils-down week<br />final evaluation<br />16 - 23 September'''
+
!colspan="2" style="text-align: right"| pencils-down week<br />final evaluation<br />16 - 23 September
 
|
 
|
 
# total 12000 stems in dix
 
# total 12000 stems in dix

Revision as of 06:54, 20 May 2013

Major goals

  • Good WER
  • Clean testvoc
  • 12,000 stems in bidix (~1000 stems per week, or ~200 per day)
  • Sort Adjective and Noun stems in kir.lexc into appropriate categories
  • Trimmed coverage approaching 90%
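
The "Good WER" goal refers to word error rate: the word-level edit distance between the system output and a reference translation, divided by the number of reference words. The following is a minimal sketch of that calculation, not the project's own evaluation tooling; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are split on whitespace; real evaluation scripts
    may tokenise and normalise differently.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous_row[j] = edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(
                min(
                    previous_row[j] + 1,                      # reference word missing from hypothesis
                    current_row[j - 1] + 1,                   # extra word in hypothesis
                    previous_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous_row = current_row

    return previous_row[-1] / len(ref_words)


if __name__ == "__main__":
    # Toy strings, not real Kazakh or Kyrgyz data.
    print(f"{word_error_rate('a b c d', 'a x c'):.0%}")
```

Under this definition, a 500-word evaluation text with a WER of ~10% corresponds to roughly 50 word-level edits against the reference.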

Schedule

Timeline

See the GSoC 2013 Timeline for the complete timeline. Important coding dates follow:

  • June 17th: coding begins
  • July 29th - August 2nd: midterm evaluations
  • September 16th - September 23rd: pencils down
  • September 27th: final evaluation

Workplan

week | dates | goals | eval | accomplishments | notes
post-application period
3 - 24 May
  1. finish coding challenge with WER ~10%
  2. trimmed coverage 45%
  3. total 250 stems in dix
eval: 4/5 pass
accomplishments:
  1. coding challenge: WER ~9%
  2. trimmed coverage: 52%, 48%
  3. stems in dix: 380
notes:
  • Demonstrated ability to add stems to dix and lexc.
  • A couple of easy lexical selection rules are still not written.
  • Needs to learn more about other aspects of Apertium and evaluation.
Firespeaker 06:45, 20 May 2013 (UTC)
community bonding period
27 May - 16 June
  1. run first testvoc
  2. run coverage scripts
  3. get first frequency lists
  4. write ≥4 lexical selection rules
  5. write ≥2 transfer rules
  6. write ≥3 disambig rules
1 17 - 22 June
  1. total 1500 stems in dix
  2. clean testvoc for <postadv> <ij>
  3. 500-word evaluation, WER ~10%
  4. trimmed coverage 51%
2 23 - 29 June
  1. total 2400 stems in dix
  2. clean testvoc for <num> <post>
  3. trimmed coverage 53%
3 30 June - 6 July
  1. total 3200 stems in dix
  2. clean testvoc for <cnjcoo> <cnjadv> <cnjsub>
  3. trimmed coverage 55%
4 7 - 13 July
  1. total 4000 stems in dix
  2. clean testvoc for <adv>
  3. trimmed coverage 59%
5 14 - 20 July
  1. total 4800 stems in dix
  2. clean testvoc for <prn> <det>
  3. trimmed coverage 63%
6 21 - 27 July
  1. total 5600 stems in dix
  2. clean testvoc for <adj> <adj><advl>
  3. trimmed coverage 68%
7 28 July - 3 August
  1. total 6400 stems in dix
  2. trimmed coverage 70%
midterm eval
2 August
  1. total 6500 stems in dix
  2. 500-word evaluation, WER ~10%
  3. trimmed coverage 72%
8 4 - 10 August
  1. total 7200 stems in dix
  2. clean testvoc for <n> <num><subst> <np> <adj><subst>
  3. trimmed coverage 75%
9 11 - 17 August
  1. total 8000 stems in dix
  2. trimmed coverage 78%
10 18 - 24 August
  1. total 8800 stems in dix
  2. trimmed coverage 81%
11 25 - 31 August
  1. total 9600 stems in dix
  2. clean testvoc for <v>
  3. trimmed coverage 83%
12 1 - 7 September
  1. total 10400 stems in dix
  2. trimmed coverage 85%
13 8 - 15 September
  1. total 11200 stems in dix
  2. trimmed coverage 87%
pencils-down week
final evaluation
16 - 23 September
  1. total 12000 stems in dix
  2. 500-word evaluation, WER ~10%
  3. clean testvoc for all categories
  4. trimmed coverage 88%
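
The weekly stem targets in the table can be checked against the overall goal of 12,000 stems with a few lines of arithmetic. The sketch below copies the "total N stems in dix" figures for weeks 1 to 13 from the workplan; the names `WEEKLY_STEM_TARGETS`, `weekly_increments` and `average_pace` are hypothetical and exist only for this illustration.

```python
# Total stems in dix planned for the end of each coding week (from the workplan table).
WEEKLY_STEM_TARGETS = {
    1: 1500, 2: 2400, 3: 3200, 4: 4000, 5: 4800, 6: 5600, 7: 6400,
    8: 7200, 9: 8000, 10: 8800, 11: 9600, 12: 10400, 13: 11200,
}
# Target for the pencils-down week / final evaluation row.
FINAL_TARGET = 12000


def weekly_increments(targets: dict[int, int]) -> list[int]:
    """Stems to be added between each pair of consecutive weeks."""
    weeks = sorted(targets)
    return [targets[later] - targets[earlier] for earlier, later in zip(weeks, weeks[1:])]


def average_pace(targets: dict[int, int]) -> float:
    """Mean number of stems added per week across the plan."""
    increments = weekly_increments(targets)
    return sum(increments) / len(increments)


if __name__ == "__main__":
    print("weekly increments:", weekly_increments(WEEKLY_STEM_TARGETS))
    print(f"average pace: {average_pace(WEEKLY_STEM_TARGETS):.0f} stems/week")
    print("left for pencils-down week:", FINAL_TARGET - WEEKLY_STEM_TARGETS[13])
```

With these figures the plan adds 900 stems between weeks 1 and 2 and 800 per week after that, which averages about 808 stems per week. That is somewhat below the "~1000 stems per week" quoted under Major goals.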