Difference between revisions of "User:Kamush/GSoC2021ProgresReport"

Line 20:
  * Initialize kaz-uzb pair
  * Collect data in both languages
- | 426
+ | style = "text-align: center;" | 426
  (+426)
- | 43.80 %
+ | style = "text-align: center;" | 43.80 %
  | -
  | -

Line 35:
  June 6-12
  |Make Uzbek better
- | 2220
+ | style = "text-align: center;" | 2220
  (+1794)
- | 52.11 %
+ | style = "text-align: center;" | 52.11 %
  | -
  | -

Line 49:
  June 13-19
  | Expand bilingual dictionary
- | 5262
+ | style = "text-align: center;" | 5262
  (+3042)
- | -
+ | style = "text-align: center;" | 77.03 %
  | 74.77% / 67.57%
  | 64.23% / 54.37%

Line 60:
  June 20-26
  | More on .dix and .lrx
- | 8543
+ | style = "text-align: center;" | 8543
  (+3281)
- | -
+ | style = "text-align: center;" | -
  | 74.77% / 67.57%
  | 64.23% / 54.37%

Line 72:
  June 27-July 3
  |Focus on transfer rules
- | 9432
+ | style = "text-align: center;" | 9432
  (+889)
- | -
+ | style = "text-align: center;" | -
  | 74.77% / 67.57%
  | 64.23% / 54.37%

Line 83:
  July 4-10
  |Test translator and expand more
- | 11008
+ | style = "text-align: center;" | 11008
  (+1576)
- | 82.81%
+ | style = "text-align: center;" | 82.81%
  | 74.77% / 67.57%
  | 64.23% / 54.37%

Line 96:
  July 11-17
  |Focus more on transfer rules
- | -
+ | style = "text-align: center;" | -
- | -
+ | style = "text-align: center;" | -
  | -
  | -

Line 105:
  July 18-24
  |Test the kaz-uzb translator
- | -
+ | style = "text-align: center;" | -
- | -
+ | style = "text-align: center;" | -
  | -
  | -

Line 114:
  July 25-31
  |Focus on transfer rules
- | -
+ | style = "text-align: center;" | -
- | -
+ | style = "text-align: center;" | -
  | -
  | -

Line 123:
  August 1-7
  |Focus on testvoc
- | -
+ | style = "text-align: center;" | -
- | -
+ | style = "text-align: center;" | -
  | -
  | -

Line 132:
  August 8-14
  |Finalize work
- | -
+ | style = "text-align: center;" | -
- | -
+ | style = "text-align: center;" | -
  | -
  | -

Revision as of 08:37, 12 July 2021

Progress Report

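{|
! rowspan="2" | Time Period
! rowspan="2" | Goal
! Bidix
! Coverage
! colspan="2" | WER / PER
! rowspan="2" | Details/Comments
|-
! kaz-uzb
! kaz-uzb
! kaz-uzb
! uzb-kaz
|-
| Community Bonding Period

May 17-June 5
|
* Install Apertium
* Initialize kaz-uzb pair
* Collect data in both languages
| style = "text-align: center;" | 426

(+426)
| style = "text-align: center;" | 43.80%
| -
| -
|
* Installed Apertium and the necessary tools;
* Cloned apertium-kaz and apertium-uzb, initialized the kaz-uzb pair;
* Translated a small sample text;
* Extracted Uzbek and Kazakh wiki corpora;
* Collected a Kazakh-Uzbek dictionary and parallel corpora.
|-
| Week 1

June 6-12
| Make Uzbek better
| style = "text-align: center;" | 2220

(+1794)
| style = "text-align: center;" | 52.11%
| -
| -
|
* Went through all Uzbek and Kazakh stems;
* Initialized the pair with apertium-recursive;
* Collected dictionaries from other pairs for crossdic;
* Obtained crossdic results in two ways.
|-
| Week 2

June 13-19
| Expand bilingual dictionary
| style = "text-align: center;" | 5262

(+3042)
| style = "text-align: center;" | 77.03%
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Started adding bilingual dictionary entries.
|-
| Week 3

June 20-26
| More on .dix and .lrx
| style = "text-align: center;" | 8543

(+3281)
| style = "text-align: center;" | -
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Expanded the bilingual dictionary;
* Started writing sample lexical selection rules.
|-
| Week 4

June 27-July 3
| Focus on transfer rules
| style = "text-align: center;" | 9432

(+889)
| style = "text-align: center;" | -
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Expanded the bilingual dictionary further.
|-
| Week 5

July 4-10
| Test translator and expand more
| style = "text-align: center;" | 11008

(+1576)
| style = "text-align: center;" | 82.81%
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Expanded the bilingual dictionary;
* Collected texts for lexical selection rules and tried a small script;
* Translated a large Kazakh text into Uzbek for a more reliable WER/PER calculation.
|-
| Week 6

July 11-17
| Focus more on transfer rules
| style = "text-align: center;" | -
| style = "text-align: center;" | -
| -
| -
| -
|-
| Week 7

July 18-24
| Test the kaz-uzb translator
| style = "text-align: center;" | -
| style = "text-align: center;" | -
| -
| -
| -
|-
| Week 8

July 25-31
| Focus on transfer rules
| style = "text-align: center;" | -
| style = "text-align: center;" | -
| -
| -
| -
|-
| Week 9

August 1-7
| Focus on testvoc
| style = "text-align: center;" | -
| style = "text-align: center;" | -
| -
| -
| -
|-
| Week 10

August 8-14
| Finalize work
| style = "text-align: center;" | -
| style = "text-align: center;" | -
| -
| -
| -
|}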
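
The WER / PER column reports word error rate and position-independent error rate. The report does not say which tool or exact formula produced these figures, so the following is only a minimal sketch of how the two metrics are commonly defined, not the script used for this table. The function names and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate (one common definition).

    Word order is ignored: only the bag-of-words overlap counts.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not real Kazakh or Uzbek data.
    reference = "a b c d"
    hypothesis = "a c b d"
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"PER: {per(reference, hypothesis):.2%}")
```

Because PER ignores word order, it is never higher than WER for the same sentence pair, which is consistent with the pairs in the table (for example 74.77% / 67.57%).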