Difference between revisions of "User:Kamush/GSoC2021ProgresReport"
Line 17: | Line 17: | ||
May 17-June 5 |
May 17-June 5 |
||
| |
| |
||
* |
* Installed Apertium |
||
* Initialize kaz-uzb pair |
* Initialize kaz-uzb pair |
||
* Collect data in both languages |
* Collect data in both languages |
||
Line 25: | Line 25: | ||
| - |
| - |
||
| |
| |
||
* |
* Installed Apertium and necessary tools; |
||
* Cloned Apertium-kaz and apertium-uzb, initialized the kaz-uzb pair |
|||
* Send the first PR that can translate a small sample text; |
|||
* Translated a small sample text; |
|||
* Extract Uzbek and Kazakh wiki corpus; |
|||
* |
* Extracted Uzbek and Kazakh wiki corpus; |
||
* |
* Collected Kazakh-Uzbek dictionary and parallel corpora; |
||
|- |
|- |
||
|Week 1 |
|Week 1 |
||
Line 39: | Line 39: | ||
| - |
| - |
||
| |
| |
||
* |
* Went through all Uzbek and Kazakh stems; |
||
* Initialized the pair with apertium-recursive; |
|||
* Clean (deduplicate) and correct uzb stems; |
|||
* Collected dictionaries from other pairs for crossdic; |
|||
* Improve Uzbek lexicon; |
|||
* Obtained crossdic results in two ways. |
|||
|- |
|- |
||
|Week 2 |
|Week 2 |
||
Line 51: | Line 52: | ||
| - |
| - |
||
| |
| |
||
* |
* Started adding bilingual dictionary elements; |
||
|- |
|- |
||
|Week 3 |
|Week 3 |
||
Line 61: | Line 62: | ||
| - |
| - |
||
| |
| |
||
* |
* Expanded bilingual dictionary; |
||
* Lexical selection rules; |
* Started sample Lexical selection rules; |
||
|- |
|- |
||
|Week 4 |
|Week 4 |
||
Line 72: | Line 73: | ||
| - |
| - |
||
| |
| |
||
* |
* Expanded bilingual dictionary more; |
||
* Lexical selection rules; |
|||
|- |
|- |
||
|Week 5 |
|Week 5 |
||
Line 83: | Line 83: | ||
| - |
| - |
||
| |
| |
||
* Test the kaz-uzb translator; |
|||
* Expand the Uzbek lexicon with missing words; |
|||
* Translated a large Kazakh text into Uzbek for better WER/PER calculation. |
|||
* Expand lexical selection rules; |
|||
|- |
|- |
||
|Week 6 |
|Week 6 |
||
Line 95: | Line 94: | ||
| - |
| - |
||
| - |
| - |
||
| |
| - |
||
* Work more on transfer rules; |
|||
* More bilingual dictionary; |
|||
* More lexical selection rules; |
|||
* |
|||
|- |
|- |
||
|Week 7 |
|Week 7 |
||
Line 108: | Line 103: | ||
| - |
| - |
||
| - |
| - |
||
| |
| - |
||
* Test the kaz-uzb translator; |
|||
* Extend the Uzbek lexicon with missing words; |
|||
* Extend the Kazakh lexicon with missing words; |
|||
* Extend bilingual dictionary; |
|||
|- |
|- |
||
|Week 8 |
|Week 8 |
||
Line 122: | Line 112: | ||
| - |
| - |
||
| - |
| - |
||
| |
| - |
||
* Add words, rules; |
|||
* Work on transfer rules; |
|||
* Start the testvoc; |
|||
|- |
|- |
||
|Week 9 |
|Week 9 |
||
Line 134: | Line 121: | ||
| - |
| - |
||
| - |
| - |
||
| |
| - |
||
* Add words, rules; |
|||
* Transfer rules kaz-uzb; |
|||
* Testvoc kaz-uzb |
|||
|- |
|- |
||
|Week 10 |
|Week 10 |
||
Line 146: | Line 130: | ||
| - |
| - |
||
| - |
| - |
||
| |
| - |
||
* Test the kaz-uzb translator; |
|||
* Check the transfer rules; |
|||
* Check the testvoc; |
|||
* Write the final report; |
|||
|- |
|- |
||
|} |
|} |
Revision as of 10:51, 11 July 2021
Progress Report
Time Period | Goal | Bidix (kaz-uzb) | Coverage (kaz-uzb) | WER, PER (kaz-uzb) | WER, PER (uzb-kaz) | Details/Comments |
---|---|---|---|---|---|---|
Community Bonding Period (May 17-June 5) | | - | - | - | - | |
Week 1 (June 6-12) | Make Uzbek better | - | - | - | - | |
Week 2 (June 13-19) | Expand bilingual dictionary | - | - | - | - | |
Week 3 (June 20-26) | More on .dix and .lrx | - | - | - | - | |
Week 4 (June 27-July 3) | Focus on transfer rules | - | - | - | - | |
Week 5 (July 4-10) | Test translator and expand more | - | - | - | - | |
Week 6 (July 11-17) | Focus more on transfer rules | - | - | - | - | - |
Week 7 (July 18-24) | Test the kaz-uzb translator | - | - | - | - | - |
Week 8 (July 25-31) | Focus on transfer rules | - | - | - | - | - |
Week 9 (August 1-7) | Focus on testvoc | - | - | - | - | - |
Week 10 (August 8-14) | Finalize work | - | - | - | - | - |
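
The WER and PER columns in the table are still unfilled. As a rough illustration of what those two figures measure, here is a minimal sketch, assuming whitespace tokenization and one common bag-of-words formulation of PER. It is not the evaluation script used for this project, and the function names `word_error_rate` and `position_independent_error_rate` are hypothetical.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Dynamic-programming Levenshtein distance over word tokens,
    # keeping only the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def position_independent_error_rate(reference: str, hypothesis: str) -> float:
    """Bag-of-words error rate: word order is ignored (assumed formulation)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Multiset intersection counts the words shared by both sentences.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not real Kazakh or Uzbek text.
    reference = "a b c d"
    hypothesis = "a c b d"
    print(word_error_rate(reference, hypothesis))
    print(position_independent_error_rate(reference, hypothesis))
```

Because PER ignores word order, a translation with the right words in the wrong order scores well on PER but poorly on WER. The gap between the two gives a rough signal of how much reordering work is left for the transfer rules.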