Difference between revisions of "User:Kamush/GSoC2021ProgresReport"


Revision as of 08:37, 12 July 2021

Progress Report

{| class="wikitable"
! colspan="2" rowspan="2" | Time Period
! rowspan="2" | Goal
! Bidix
! Coverage
! colspan="2" | WER / PER
! rowspan="2" | Details/Comments
|-
! kaz-uzb
! kaz-uzb
! kaz-uzb
! uzb-kaz
|-
| Community Bonding Period
| May 17-June 5
|
* Install Apertium
* Initialize kaz-uzb pair
* Collect data in both languages
| 426 (+426)
| 43.80%
| -
| -
|
* Installed Apertium and the necessary tools;
* Cloned apertium-kaz and apertium-uzb, initialized the kaz-uzb pair;
* Translated a small sample text;
* Extracted Uzbek and Kazakh wiki corpora;
* Collected a Kazakh-Uzbek dictionary and parallel corpora.
|-
| Week 1
| June 6-12
| Make Uzbek better
| 2220 (+1794)
| 52.11%
| -
| -
|
* Went through all Uzbek and Kazakh stems;
* Initialized the pair with apertium-recursive;
* Collected dictionaries from other pairs for crossdic;
* Obtained crossdic results in two ways.
|-
| Week 2
| June 13-19
| Expand bilingual dictionary
| 5262 (+3042)
| 77.03%
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Started adding bilingual dictionary entries.
|-
| Week 3
| June 20-26
| More on .dix and .lrx
| 8543 (+3281)
| -
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Expanded the bilingual dictionary;
* Started writing sample lexical selection rules.
|-
| Week 4
| June 27-July 3
| Focus on transfer rules
| 9432 (+889)
| -
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Expanded the bilingual dictionary further.
|-
| Week 5
| July 4-10
| Test translator and expand more
| 11008 (+1576)
| 82.81%
| 74.77% / 67.57%
| 64.23% / 54.37%
|
* Expanded the bilingual dictionary;
* Collected texts for lexical selection rules and tried a small script;
* Translated a large Kazakh text into Uzbek for a better WER/PER calculation.
|-
| Week 6
| July 11-17
| Focus more on transfer rules
| -
| -
| -
| -
| -
|-
| Week 7
| July 18-24
| Test the kaz-uzb translator
| -
| -
| -
| -
| -
|-
| Week 8
| July 25-31
| Focus on transfer rules
| -
| -
| -
| -
| -
|-
| Week 9
| August 1-7
| Focus on testvoc
| -
| -
| -
| -
| -
|-
| Week 10
| August 8-14
| Finalize work
| -
| -
| -
| -
| -
|}
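The WER / PER columns above report word error rate and position-independent error rate. As a rough illustration of what these two numbers measure, here is a minimal sketch using their common textbook definitions. It is not the script used for this report, and the exact figures produced by Apertium's evaluation tooling may differ (for example in tokenization or case handling). The function names `word_error_rate` and `position_independent_error_rate` are hypothetical, and whitespace tokenization is an assumption.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def position_independent_error_rate(reference: str, hypothesis: str) -> float:
    """Like WER, but word order is ignored (bag-of-words comparison)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Multiset intersection: how many words match regardless of position.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    # Made-up placeholder tokens, not data from the report.
    reference = "a b c d"
    hypothesis = "a c b d"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
    print(f"PER: {position_independent_error_rate(reference, hypothesis):.2%}")
```

Because PER ignores word order, it is never higher than WER for the same sentence pair, which matches the pattern in the table (for example 74.77% WER against 67.57% PER).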