Difference between revisions of "Apertium-kaz-kir/Workplan"
Firespeaker (talk | contribs) |
Firespeaker (talk | contribs) |
||
(35 intermediate revisions by the same user not shown) | |||
Line 86: | Line 86: | ||
| {{Workeval5|0}} |
| {{Workeval5|0}} |
||
| |
| |
||
+ | # stems in dix: 408 |
||
| |
| |
||
* did not show up —[[User:Firespeaker|Firespeaker]] 02:28, 2 July 2013 (UTC) |
* did not show up —[[User:Firespeaker|Firespeaker]] 02:28, 2 July 2013 (UTC) |
||
Line 95: | Line 96: | ||
# clean testvoc for {{tag|cnjcoo}} {{tag|cnjadv}} {{tag|cnjsub}} |
# clean testvoc for {{tag|cnjcoo}} {{tag|cnjadv}} {{tag|cnjsub}} |
||
# trimmed coverage 55% |
# trimmed coverage 55% |
||
+ | | {{Workeval5|2}} |
||
+ | | |
||
+ | # stems in dix: 508 |
||
+ | # trimmed coverage: 59.5%,51.5% |
||
| |
| |
||
+ | * trimmed coverage good |
||
− | | |
||
+ | * too narrow a focus on a single corpus |
||
− | | |
||
+ | * number of stems too low |
||
+ | * no new WER text |
||
+ | * no testvoc |
||
+ | —[[User:Firespeaker|Firespeaker]] 20:43, 8 July 2013 (UTC) |
||
|- |
|- |
||
! 4 |
! 4 |
||
Line 105: | Line 114: | ||
# clean testvoc for {{tag|adv}} |
# clean testvoc for {{tag|adv}} |
||
# trimmed coverage 59% |
# trimmed coverage 59% |
||
+ | |{{Workeval5|2}} |
||
− | | |
||
+ | |rowspan="2"| |
||
− | | |
||
+ | # stems in dix: 2574 |
||
− | | |
||
+ | # trimmed coverage: 69.3%,63.8% |
||
+ | # azattyq_24455849 WER: 14.78% |
||
+ | # completed most of TODO-list |
||
+ | |rowspan="2"| |
||
+ | * good progress on adding stems |
||
+ | * fixed little things as directed |
||
+ | * good progress on post-editing process |
||
+ | * didn't make good progress on reducing WER |
||
+ | * still no testvoc |
||
+ | * committed once every 3 or 4 days; '''should be committing every day''' |
||
+ | * poor communication with mentors; needs to be around more often |
||
+ | —[[User:Firespeaker|Firespeaker]] 22:16, 22 July 2013 (UTC) |
||
|- |
|- |
||
! 5 |
! 5 |
||
− | !style="text-align: right"| 14 -20 July |
+ | !style="text-align: right"| 14 - 20 July |
| |
| |
||
# total 4800 stems in dix |
# total 4800 stems in dix |
||
# clean testvoc for {{tag|prn}} {{tag|det}} |
# clean testvoc for {{tag|prn}} {{tag|det}} |
||
# trimmed coverage 63% |
# trimmed coverage 63% |
||
+ | |{{Workeval5|3}} |
||
− | | |
||
− | | |
||
− | | |
||
|- |
|- |
||
! 6 |
! 6 |
||
Line 125: | Line 144: | ||
# clean testvoc for {{tag|adj}} {{tag|adj}}{{tag|advl}} |
# clean testvoc for {{tag|adj}} {{tag|adj}}{{tag|advl}} |
||
# trimmed coverage 68% |
# trimmed coverage 68% |
||
+ | |{{Workeval5|3}} |
||
− | | |
||
+ | |rowspan="3"| |
||
− | | |
||
+ | # stems in dix: 5552 |
||
− | | |
||
+ | # trimmed coverage: 72%,67% |
||
+ | # azattyq_24455849 WER: 18.01% |
||
+ | |rowspan="2"| |
||
+ | * good improvement in dix |
||
+ | ** should be checking for errors (e.g., extra spaces) |
||
+ | * not much progress with WER text |
||
+ | ** simple lrx and t1x should be enough here |
||
+ | * still no indication of progress with testvoc |
||
+ | * better communication and commit frequency, but could still improve |
||
+ | —[[User:Firespeaker|Firespeaker]] 18:21, 1 August 2013 (UTC) |
||
|- |
|- |
||
! 7 |
! 7 |
||
Line 134: | Line 163: | ||
# total 6400 stems in dix |
# total 6400 stems in dix |
||
# trimmed coverage 70% |
# trimmed coverage 70% |
||
+ | |{{Workeval5|2}} |
||
− | | |
||
− | | |
||
− | | |
||
|- |
|- |
||
− | !colspan="2" style="text-align: right"| midterm eval<br />2 August |
+ | !colspan="2" style="text-align: right"| [[Apertium-kaz-kir/TODO#By_midterm|midterm eval]]<br />2 August |
| |
| |
||
# total 6500 stems in dix |
# total 6500 stems in dix |
||
# 500-word evaluation, WER ~10% |
# 500-word evaluation, WER ~10% |
||
# trimmed coverage 72% |
# trimmed coverage 72% |
||
+ | |{{Workeval5|2}} |
||
| |
| |
||
+ | * midterm TODO list goals only partially attained |
||
− | | |
||
+ | * overall progress has been mediocre |
||
− | | |
||
+ | * among the lowest-performing students |
||
+ | * noticeable improvement in the last few weeks |
||
+ | * needs to improve more to pass the final |
||
+ | —[[User:Firespeaker|Firespeaker]] 18:26, 1 August 2013 (UTC) |
||
|- |
|- |
||
! 8 |
! 8 |
||
Line 153: | Line 185: | ||
# clean testvoc for {{tag|n}} {{tag|num}}{{tag|subst}} {{tag|np}} {{tag|adj}}{{tag|subst}} |
# clean testvoc for {{tag|n}} {{tag|num}}{{tag|subst}} {{tag|np}} {{tag|adj}}{{tag|subst}} |
||
# trimmed coverage 75% |
# trimmed coverage 75% |
||
+ | |{{Workeval5|2}} |
||
− | | |
||
+ | |rowspan="3"| |
||
− | | |
||
+ | # stems in dix: 6493 |
||
+ | # trimmed coverage: 79.6%,74.1% |
||
| |
| |
||
|- |
|- |
||
Line 162: | Line 196: | ||
# total 8000 stems in dix |
# total 8000 stems in dix |
||
# trimmed coverage 78% |
# trimmed coverage 78% |
||
+ | |{{Workeval5|2}} |
||
− | | |
||
− | | |
||
| |
| |
||
|- |
|- |
||
Line 171: | Line 204: | ||
# total 8800 stems in dix |
# total 8800 stems in dix |
||
# trimmed coverage 81% |
# trimmed coverage 81% |
||
+ | |{{Workeval5|3}} |
||
− | | |
||
− | | |
||
| |
| |
||
|- |
|- |
||
Line 181: | Line 213: | ||
# clean testvoc for {{tag|v}} |
# clean testvoc for {{tag|v}} |
||
# trimmed coverage 83% |
# trimmed coverage 83% |
||
+ | |{{Workeval5|3}} |
||
| |
| |
||
+ | # stems in dix: 6730 |
||
− | | |
||
+ | # trimmed coverage: 82.5%,78.4% |
||
+ | # azattyq_24455849 WER: 6.62% |
||
| |
| |
||
|- |
|- |
||
Line 190: | Line 225: | ||
# total 10400 stems in dix |
# total 10400 stems in dix |
||
# trimmed coverage 85% |
# trimmed coverage 85% |
||
+ | |{{Workeval5|3}} |
||
| |
| |
||
+ | # stems in dix: 7007 |
||
+ | # trimmed coverage: 84.2%,79.8% |
||
| |
| |
||
+ | * Good [[Turkic_lexicon#Kyrgyz|adjective typology]] |
||
− | | |
||
+ | * Decent progress on coverage |
||
+ | * Not around much later in the week |
||
+ | * Still no testvoc... |
||
+ | —[[User:Firespeaker|Firespeaker]] 07:29, 10 September 2013 (UTC) |
||
|- |
|- |
||
! 13 |
! 13 |
||
Line 199: | Line 241: | ||
# total 11200 stems in dix |
# total 11200 stems in dix |
||
# trimmed coverage 87% |
# trimmed coverage 87% |
||
+ | |{{Workeval5|1}} |
||
| |
| |
||
+ | # stems in dix: 7454 |
||
+ | # trimmed coverage: 85.2%,80.4% |
||
| |
| |
||
+ | * Decent increase in coverage |
||
− | | |
||
+ | * Still no testvoc |
||
+ | * Still ~600 unsorted ADJ |
||
+ | * Not around much |
||
+ | —[[User:Firespeaker|Firespeaker]] 20:06, 22 September 2013 (UTC) |
||
|- |
|- |
||
!colspan="2" style="text-align: right"| pencils-down week<br />final evaluation<br />16 - 23 September |
!colspan="2" style="text-align: right"| pencils-down week<br />final evaluation<br />16 - 23 September |
||
Line 210: | Line 259: | ||
# trimmed coverage 88% |
# trimmed coverage 88% |
||
# release 0.1.0 and move to trunk |
# release 0.1.0 and move to trunk |
||
+ | | |
||
+ | | |
||
+ | # stems in dix: 7546 |
||
+ | # trimmed coverage: 85.8%,81.6% |
||
+ | | |
||
+ | * Good coverage |
||
+ | * "Good" WER results |
||
+ | ** But lots of # and * errors :( |
||
+ | * No work on testvoc |
||
+ | * Some ADJ sorted; still >500 unsorted |
||
+ | * only 2 sets of LRX rules since early in GSoC |
||
+ | * only 1 transfer rule since early in GSoC |
||
+ | |- |
||
+ | !colspan="2" style="text-align: right"| Final evaluation |
||
+ | | |
||
| |
| |
||
| |
| |
||
| |
| |
||
+ | * Has improved coverage a certain amount |
||
+ | * Has not done anything else |
||
+ | * Mentors have had to nag to get him to work |
||
+ | * Has not been around enough |
||
+ | * Among the lowest-performing students |
||
+ | * Has not improved since midterm |
||
+ | * Last-ditch efforts not at all impressive |
||
|} |
|} |
||
Latest revision as of 06:42, 23 September 2013
Major goals
- Good WER
- Clean testvoc
- 12'000 stems in bidix (~1000 stems per week, or ~200 per day)
- Sort Adjective and Noun stems in kir.lexc into appropriate categories
- Trimmed coverage approaching 90%
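Since "Good WER" is one of the goals, here is a minimal sketch of how a word error rate can be computed. It is not the evaluation tool used for the figures on this page: the function name `word_error_rate` is hypothetical, tokenisation is plain whitespace splitting, and the post-edited text is assumed to be the reference while the raw translation is the hypothesis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: tokens are whitespace-separated and compared exactly
    (case- and punctuation-sensitive).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(f"{word_error_rate('a b c d', 'a x c d'):.2%}")
```

A result of 0.10 corresponds to the "WER ~10%" style of target used in the workplan.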
Schedule
Timeline
See the GSoC 2013 Timeline for the complete timeline. Important coding dates follow:
- June 17th: coding begins
- July 29th - August 2nd: midterm evaluations
- September 16th - September 23rd: pencils down
- September 27th: final evaluation
Workplan
week | dates | goals | eval | accomplishments | notes |
---|---|---|---|---|---|
post-application period 3 - 24 May |
|
|
| ||
community bonding period 27 May - 16 June |
note: should be in IRC every day |
|
—Firespeaker 02:28, 2 July 2013 (UTC) | ||
1 | 17 - 22 June |
|
| ||
2 | 23 - 29 June |
|
|
| |
3 | 30 June - 6 July |
|
|
—Firespeaker 20:43, 8 July 2013 (UTC) | |
4 | 7 - 13 July |
|
|
—Firespeaker 22:16, 22 July 2013 (UTC) | |
5 | 14 - 20 July |
|
|||
6 | 21 - 27 July |
|
|
—Firespeaker 18:21, 1 August 2013 (UTC) | |
7 | 28 July - 3 August |
|
|||
midterm eval 2 August |
|
—Firespeaker 18:26, 1 August 2013 (UTC) | |||
8 | 4 - 10 August |
|
|
||
9 | 11 - 17 August |
|
|||
10 | 18 - 24 August |
|
|||
11 | 25 - 31 August |
|
|
||
12 | 1 - 7 September |
|
|
—Firespeaker 07:29, 10 September 2013 (UTC) | |
13 | 8 - 15 September |
|
|
—Firespeaker 20:06, 22 September 2013 (UTC) | |
pencils-down week final evaluation 16 - 23 September |
|
|
| ||
Final evaluation |
|
Tips and Tricks
Adding stems quickly
- Add top stems from frequency lists of unknown forms
- Use spectie's dix-entries-to-be-checked script
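The first tip can be sketched as a small script. This is an illustration only, not the script mentioned above: it assumes the translator output marks unknown forms with a leading `*`, and the helper names `unknown_frequencies` and `naive_coverage` are hypothetical. The coverage figure it gives is a rough token-level estimate, not necessarily the trimmed-coverage measure reported in the workplan.

```python
import re
from collections import Counter

# A token is a run of word characters, optionally preceded by the
# asterisk that (by assumption) marks an unknown form.
TOKEN = re.compile(r"\*?\w+")


def unknown_frequencies(text: str) -> tuple[Counter, int]:
    """Return (frequency list of unknown forms, total token count)."""
    tokens = TOKEN.findall(text)
    unknown = Counter(t[1:].lower() for t in tokens if t.startswith("*"))
    return unknown, len(tokens)


def naive_coverage(text: str) -> float:
    """Share of tokens that are not marked as unknown."""
    unknown, total = unknown_frequencies(text)
    if total == 0:
        raise ValueError("text contains no tokens")
    return (total - sum(unknown.values())) / total


if __name__ == "__main__":
    sample = "Бул *китеп жакшы *китеп *үй"
    freqs, _ = unknown_frequencies(sample)
    # Most frequent unknown forms first: these are the stems worth adding first.
    for form, count in freqs.most_common(10):
        print(count, form)
    print(f"coverage: {naive_coverage(sample):.1%}")
```

Sorting by frequency means each stem added covers as many corpus tokens as possible, which is why the top of the list is the place to start.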