Tatar and Russian
Revision as of 16:39, 18 May 2014 by Ilnar.salimzyan (→Workplan (GSoC 2014): set up a template)
This is a language pair translating between Tatar and Russian. The pair is currently located in the nursery.
Current state
TODO: add a stats table here, in the same manner as on the pages for monolingual modules. Essential things to track (following the Turkic-Turkic translator page):
- testvoc (clean or not)
- trimmed coverage
- number of stems in bidix
- WER on the development corpus
- WER on unseen text(s)
Workplan (GSoC 2014)
This is a workplan for development efforts for the Tatar to Russian translator in Google Summer of Code 2014.
- Trimmed coverage means the coverage of the morphological analyser after it has been trimmed according to the bilingual dictionary of the pair, that is, when it contains only stems which are also in the bilingual dictionary.
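- Testvoc for a category means that the category is testvoc clean, that is, every form of that category which the analyser can produce is translated without errors.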
- Evaluation means taking a sample of words, translating it, and measuring the post-edition word error rate (WER). The output for those words should be clean.
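The two numeric metrics defined above can be illustrated with a minimal Python sketch. It is not the tooling used by the project. The function names `coverage` and `wer` are hypothetical, the set `known_words` is a stand-in for whatever the trimmed analyser recognises, and WER is assumed here to be the word-level edit distance between the raw translation and its post-edited version, divided by the length of the post-edited version.

```python
def coverage(tokens, known_words):
    """Fraction of corpus tokens found in known_words.

    known_words is a hypothetical stand-in for the set of surface forms
    that the trimmed morphological analyser can analyse.
    """
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in known_words) / len(tokens)


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference length.

    Assumption: reference is the post-edited translation and
    hypothesis is the raw machine translation output.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words processed so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Toy data, invented purely for illustration.
print(coverage(["a", "b", "c", "d"], {"a", "b", "c"}))  # 0.75
print(wer("a b c d", "a x c"))                          # 0.5
```

In the second call, one substitution and one deletion against a four-word reference give a WER of 2/4. A real measurement would run the analyser over a corpus and compare against a human post-edited text rather than toy strings.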
Week | Dates | |
---|---|---|
1 | 19/05—25/05 | |
2 | 26/05—01/06 | |
3 | 02/06—08/06 | |
4 | 09/06—15/06 | |
5 | 16/06—22/06 | |
6 | 23/06—29/06 | |
7 | 30/06—06/07 | |
8 | 07/07—13/07 | |
9 | 14/07—20/07 | |
10 | 21/07—27/07 | |
11 | 28/07—03/08 | |
12 | 04/08—10/08 | |
13 | 11/08—18/08 |