User:Kamush/GSoC2021ProgresReport


Progress Report

Time Period | Goal | Bidix (kaz-uzb) | Coverage (kaz-uzb) | WER, PER (kaz-uzb) | WER, PER (uzb-kaz) | Details/Comments
Community Bonding Period

May 17-June 5

  • Install Apertium
  • Initialize kaz-uzb pair
  • Collect data in both languages
- - - -
  • Installed Apertium and the necessary tools;
  • Cloned apertium-kaz and apertium-uzb, and initialized the kaz-uzb pair;
  • Translated a small sample text;
  • Extracted Uzbek and Kazakh wiki corpora;
  • Collected a Kazakh-Uzbek dictionary and parallel corpora.
Week 1

June 6-12

Make Uzbek better - - - -
  • Went through all Uzbek and Kazakh stems;
  • Initialized the pair with apertium-recursive;
  • Collected dictionaries from other pairs for crossdic;
  • Obtained crossdic results in two ways.
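
The idea behind crossing dictionaries can be sketched as follows. This is only an illustration of the general technique (composing two bilingual dictionaries through a shared pivot language), not the actual crossdic tooling used here. The function name `cross_dictionaries`, the choice of English as pivot, and the toy entries are all assumptions made for the example.

```python
from collections import defaultdict


def cross_dictionaries(src_to_pivot, pivot_to_tgt):
    """Compose two bilingual dictionaries through a shared pivot language.

    Both arguments map a lemma to an iterable of translations.
    Returns a dict mapping each source lemma to a sorted list of
    candidate target lemmas (hypothetical helper, for illustration).
    """
    crossed = defaultdict(set)
    for src, pivots in src_to_pivot.items():
        for pivot in pivots:
            # Every target translation of the pivot word is a candidate.
            crossed[src].update(pivot_to_tgt.get(pivot, ()))
    # Drop source lemmas for which no pivot word had a translation.
    return {src: sorted(tgts) for src, tgts in crossed.items() if tgts}


# Toy data (illustrative only): Kazakh -> English and English -> Uzbek.
kaz_eng = {"кітап": ["book"], "үй": ["house", "home"]}
eng_uzb = {"book": ["kitob"], "house": ["uy"]}

candidates = cross_dictionaries(kaz_eng, eng_uzb)
```

Candidates produced this way are usually noisy when pivot words are ambiguous, so they would still need manual checking before going into the bilingual dictionary.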
Week 2

June 13-19

Expand bilingual dictionary - - - -
  • Started adding bilingual dictionary elements;
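
For readers unfamiliar with the format, a bilingual dictionary element pairs a left-side lemma with a right-side lemma, each followed by its tags. The sketch below builds one such element with Python's standard library. The helper name `bidix_entry` and the example word pair are assumptions made for illustration, and the sketch does not handle multiword lemmas.

```python
import xml.etree.ElementTree as ET


def bidix_entry(left, right, tags):
    """Build one <e><p><l>..</l><r>..</r></p></e> element as a string.

    `tags` is a list of symbol names (e.g. ["n"] for a noun) that is
    attached to both sides. Hypothetical helper, for illustration only.
    """
    entry = ET.Element("e")
    pair = ET.SubElement(entry, "p")
    for side_name, lemma in (("l", left), ("r", right)):
        side = ET.SubElement(pair, side_name)
        side.text = lemma
        for tag in tags:
            # Each tag becomes an empty <s n="..."/> element.
            ET.SubElement(side, "s", n=tag)
    return ET.tostring(entry, encoding="unicode")


# Illustrative pair: Kazakh "кітап" and Uzbek "kitob" ("book"), both nouns.
xml_line = bidix_entry("кітап", "kitob", ["n"])
```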
Week 3

June 20-26

More on .dix and .lrx - - - -
  • Expanded bilingual dictionary;
  • Started writing sample lexical selection rules;
Week 4

June 27-July 3

Focus on transfer rules - - - -
  • Expanded bilingual dictionary more;
Week 5

July 4-10

Test translator and expand more | Bidix: 11008 | Coverage: 82.81% | WER / PER (kaz-uzb): 74.77% / 67.57% | WER / PER (uzb-kaz): 64.23% / 54.37%
  • Expanded bilingual dictionary;
  • Collected texts for lexical selection rules, tried a small script;
  • Translated a large Kazakh text into Uzbek for better WER/PER calculation.
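
As a rough illustration of what the WER and PER figures measure, the sketch below computes both from a reference translation and a machine translation. It assumes whitespace tokenization and one common formulation of each metric; the function names are hypothetical, and the evaluation script actually used for the numbers in this report may differ in details.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate: like WER, but ignores word order."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    # Multiset intersection counts words shared regardless of position.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)
```

Because PER ignores word order, it is never higher than WER under these definitions, which matches the pattern in the figures above. For example, `wer("a b c d", "a c b d")` is 0.5 while `per("a b c d", "a c b d")` is 0.0.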
Week 6

July 11-17

Focus more on transfer rules - - - - -
Week 7

July 18-24

Test the kaz-uzb translator - - - - -
Week 8

July 25-31

Focus on transfer rules - - - - -
Week 9

August 1-7

Focus on testvoc - - - - -
Week 10

August 8-14

Finalize work - - - - -