Difference between revisions of "User:Kamush/GSoC2021ProgresReport"

Line 17: Line 17:
May 17-June 5
May 17-June 5
|
|
* Installing Apertium
* Installed Apertium
* Initialize kaz-uzb pair
* Initialize kaz-uzb pair
* Collect data in both languages
* Collect data in both languages
Line 25: Line 25:
| -
| -
|
|
* Installing Apertium and necessary tools;
* Installed Apertium and necessary tools;
* Cloned apertium-kaz and apertium-uzb, initialized the kaz-uzb pair;
* Send the first PR that can translate a small sample text;
* Translated a small sample text;
* Extract Uzbek and Kazakh wiki corpus;
* Collect Uzbek and Kazakh web(non-wiki) corpus;
* Extracted Uzbek and Kazakh wiki corpus;
* Collect Kazakh-Uzbek dictionary and parallel corpora;
* Collected Kazakh-Uzbek dictionary and parallel corpora;
|-
|-
|Week 1
|Week 1
Line 39: Line 39:
| -
| -
|
|
* Go through all Uzbek stems in uzb.lexc;
* Went through all Uzbek and Kazakh stems;
* Initialized the pair with apertium-recursive;
* Clean(deduplicate) and correct uzb stems;
* Collected dictionaries from other pairs for crossdic;
* Improve Uzbek lexicon;
* Obtained crossdic results in two ways.
|-
|-
|Week 2
|Week 2
Line 51: Line 52:
| -
| -
|
|
* Start adding bilingual dictionary elements;
* Started adding bilingual dictionary elements;
|-
|-
|Week 3
|Week 3
Line 61: Line 62:
| -
| -
|
|
* Expand bilingual dictionary;
* Expanded bilingual dictionary;
* Lexical selection rules;
* Started writing sample lexical selection rules;
|-
|-
|Week 4
|Week 4
Line 72: Line 73:
| -
| -
|
|
* Expand bilingual dictionary;
* Expanded bilingual dictionary more;
* Lexical selection rules;
|-
|-
|Week 5
|Week 5
Line 83: Line 83:
| -
| -
|
|
* Expanded bilingual dictionary;
* Test the kaz-uzb translator;
* Collected texts for lexical selection rules, tried a small script;
* Expand the Uzbek lexicon with missing words;
* Translated a large Kazakh text into Uzbek for better WER/PER calculation.
* Expand bilingual dictionary;
* Expand lexical selection rules;
|-
|-
|Week 6
|Week 6
Line 95: Line 94:
| -
| -
| -
| -
|
| -
* Work more on transfer rules;
* More bilingual dictionary;
* More lexical selection rules;
*
|-
|-
|Week 7
|Week 7
Line 108: Line 103:
| -
| -
| -
| -
|
| -
* Test the kaz-uzb translator;
* Extend the Uzbek lexicon with missing words;
* Extend the Kazakh lexicon with missing words;
* Extend bilingual dictionary;
* Add more lexical selection rules;
|-
|-
|Week 8
|Week 8
Line 122: Line 112:
| -
| -
| -
| -
|
| -
* Add words, rules;
* Work on transfer rules;
* Start the testvoc;
|-
|-
|Week 9
|Week 9
Line 134: Line 121:
| -
| -
| -
| -
|
| -
* Add words, rules;
* Transfer rules kaz-uzb;
* Testvoc kaz-uzb
|-
|-
|Week 10
|Week 10
Line 146: Line 130:
| -
| -
| -
| -
|
| -
* Test the kaz-uzb translator;
* Check the transfer rules;
* Check the testvoc;
* Write the final report;
|-
|-
|}
|}

Revision as of 10:51, 11 July 2021

Progress Report

{|
! rowspan="2" | Time Period
! rowspan="2" | Goal
! Bidix
! Coverage
! colspan="2" | WER, PER
! rowspan="2" | Details/Comments
|-
! kaz-uzb
! kaz-uzb
! kaz-uzb
! uzb-kaz
|-
|Community Bonding Period

May 17-June 5
|
* Installed Apertium
* Initialize kaz-uzb pair
* Collect data in both languages
| -
| -
| -
| -
|
* Installed Apertium and necessary tools;
* Cloned apertium-kaz and apertium-uzb, initialized the kaz-uzb pair;
* Translated a small sample text;
* Extracted Uzbek and Kazakh wiki corpus;
* Collected Kazakh-Uzbek dictionary and parallel corpora;
|-
|Week 1

June 6-12
|Make Uzbek better
| -
| -
| -
| -
|
* Went through all Uzbek and Kazakh stems;
* Initialized the pair with apertium-recursive;
* Collected dictionaries from other pairs for crossdic;
* Obtained crossdic results in two ways.
|-
|Week 2

June 13-19
|Expand bilingual dictionary
| -
| -
| -
| -
|
* Started adding bilingual dictionary elements;
|-
|Week 3

June 20-26
|More on .dix and .lrx
| -
| -
| -
| -
|
* Expanded bilingual dictionary;
* Started writing sample lexical selection rules;
|-
|Week 4

June 27-July 3
|Focus on transfer rules
| -
| -
| -
| -
|
* Expanded bilingual dictionary more;
|-
|Week 5

July 4-10
|Test translator and expand more
| -
| -
| -
| -
|
* Expanded bilingual dictionary;
* Collected texts for lexical selection rules, tried a small script;
* Translated a large Kazakh text into Uzbek for better WER/PER calculation.
|-
|Week 6

July 11-17
|Focus more on transfer rules
| -
| -
| -
| -
| -
|-
|Week 7

July 18-24
|Test the kaz-uzb translator
| -
| -
| -
| -
| -
|-
|Week 8

July 25-31
|Focus on transfer rules
| -
| -
| -
| -
| -
|-
|Week 9

August 1-7
|Focus on testvoc
| -
| -
| -
| -
| -
|-
|Week 10

August 8-14
|Finalize work
| -
| -
| -
| -
| -
|}
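
The WER, PER columns above are still empty. As a rough illustration of what those figures measure, here is a minimal Python sketch, not the tooling actually used for this pair. It assumes whitespace tokenisation, takes WER as the word-level edit distance divided by the reference length, and takes PER as one common position-independent variant that ignores word order. The function names `wer` and `per` are hypothetical, and an Apertium evaluation script may compute these figures differently.

```python
from collections import Counter


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def per(reference: str, hypothesis: str) -> float:
    """Position-independent error rate (assumed definition):
    (max(|ref|, |hyp|) - matching words as a multiset) / |ref|."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    # Placeholder tokens stand in for a post-edited reference and MT output.
    reference = "w1 w2 w3 w4"
    hypothesis = "w1 w3 w2 w4"
    print(f"WER = {wer(reference, hypothesis):.2f}")
    print(f"PER = {per(reference, hypothesis):.2f}")
```

Because PER ignores word order, a translation that contains the right words in the wrong positions scores better on PER than on WER, so the gap between the two gives a rough hint of how much error comes from reordering rather than from lexical choice.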