Ces-Rus/Workplan


Timeline

Week Dates Objective Measures/targets
0 now - 30/05 Find resources for improving the bilingual dictionary. Work on expanding the bilingual dictionary's coverage. Study Czech grammar. Write Bash scripts for easy alteration, compiling, and measurement. Parse UD-Czech (script already exists).
  • Create a text corpus for the various testing phases and progress measurements: 8 basic texts (200 words), 4 texts for each category (Wikipedia, news, chat/blogs/forums) (300 words), and 6 advanced texts (500 words)
  • Improve bidix size to 45% clean in testvoc for all categories
1 30/05 - 04/06 Work on expanding the bilingual dictionary, and the monolingual dictionaries where necessary. Create a new tagger for Czech. Test the tagger.
  • Improve bidix size to 60% clean in testvoc for all categories
  • Improve bidix coverage to 50%
  • Achieve a WER < 20% for 1 basic text
2 05/06 - 11/06 Continue to expand the bilingual dictionary where needed (constant task), and work on lexical selection. Work on rudimentary transfer rules, and begin on transfer rules for verbs.
  • Improve bidix size to 80% clean in testvoc for all categories
  • Improve bidix coverage to 60%
  • Achieve a WER < 20% for 2 basic texts
3 12/06 - 18/06 Continue work on transfer rules for verbs.
  • Improve bidix size to 100% clean in testvoc for all categories
  • Improve bidix coverage to 65%
  • Achieve a WER < 15% for 2 basic texts
4 19/06 - 25/06 Work on specific transfer rules for prepositions and subject addition/placement. Create rules for ancillary Russian cases in the monolingual dictionary (2nd prepositional (locative), partitive, etc.).
  • Improve bidix coverage to 70%
  • Achieve a WER < 10% on 2 basic texts
5 26/06 - 02/07 Write dictionary entries and transfer rules for specific grammatical constructions such as aby, kdyby, to... ani..., etc. Start/Finish evaluation #1.
  • Checkpoint: Measure progress of the project, and discuss the feasibility of working on Rus -> Ces. Final check on previously composed transfer rules from weeks 3-5. Test on texts, and try to "break" the translator.
  • Improve bidix coverage to 75%
  • Achieve WER < 10% on 1 basic text
  • Achieve WER < 20% on 1 advanced text
6 24/07 - 30/07 Add new transfer rules and fix the issues with existing transfer rules identified in the previous week. Begin testing on thematic texts.
  • Improve bidix coverage to 80%
  • Achieve WER < 20% on texts from Wikipedia (4 texts)
7 03/07 - 09/07 Test the translator in the different thematic areas. Identify key places for improvement and begin working on them. Compile key terms for each topic.
  • Achieve WER < 20% on texts from News (4 texts)
8 10/07 - 16/07 Work on expanding the dictionaries in previously identified areas. Solve grammatical issues with transfer rules which arise in given thematic areas.
  • Improve bidix coverage to 85%
  • Achieve WER < 20% on texts from online chat/blogs/forums (4 texts)
9 17/07 - 23/07 Test performance increases in the selected topic areas. Ascertain what still needs to be improved. Work on fixes for issues. Start/Finish evaluation #2.
  • Achieve WER < 15% on texts from all categories
10 31/07 - 06/08 Test the performance of the translator as a whole. Identify problematic areas.
  • Achieve WER < 15% on 2 advanced texts
11 07/08 - 13/08 Bug fixes, correcting the most problematic areas.
  • Improve bidix coverage to 90%
  • Achieve WER < 10% on 2 advanced texts
12 14/08 - 20/08 Documentation (will try to do this gradually), final testing, and bug fixes.
  • Achieve WER < 10% on all previous advanced texts and 1 new advanced text (6 texts)
13 21/08 - 29/08 Work on final evaluations and other bureaucratic necessities.
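
Most of the weekly targets above are stated as a word error rate (WER). As a rough illustration of what such a threshold means, here is a minimal sketch that computes WER as the word-level edit distance between a reference translation and the translator's output, divided by the number of reference words. The function name and the whitespace tokenisation are assumptions made for this example; it is not the evaluation tool used in the project, which may tokenise and normalise text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both texts are tokenised by plain whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a target such as "WER < 20%" on a 200-word text allows fewer than 40 word-level insertions, deletions, or substitutions relative to the reference translation.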

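The coverage targets can be approximated in a similar spirit. The workplan does not say how coverage was measured, so the following is only one possible sketch: it assumes the translator is run with unknown-word marking enabled, so that problem words in the output start with `*` (unknown to the analyser), `@` (missing from the bilingual dictionary), or `#` (generation failure), and it reports the share of output tokens carrying none of these marks. The function name and the sample sentence are invented for illustration.

```python
def clean_token_ratio(translated: str, marks: str = "*@#") -> float:
    """Share of whitespace-separated tokens that carry no debug mark.

    Simplification: a mark is only detected at the very start of a token,
    so a token such as '"*slovo' (mark after a quote) counts as clean.
    """
    tokens = translated.split()
    if not tokens:
        raise ValueError("translated text must contain at least one token")
    mark_prefixes = tuple(marks)  # ('*', '@', '#')
    clean = sum(1 for token in tokens if not token.startswith(mark_prefixes))
    return clean / len(tokens)


# Made-up output: two of the four tokens are marked as problematic.
sample = "это *slovo и @pes"
ratio = clean_token_ratio(sample)
```

Running the same measurement over each corpus category after every dictionary update gives a simple way to track progress against the weekly targets.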
Done!