English and Kazakh/Work plan (GSOC 2014)
Revision as of 14:59, 23 April 2014 by Francis Tyers (talk | contribs)
Corpora
Downloads
- SETimes: http://nlp.ffzg.hr/data/corpora/setimes/setimes.en-tr.txt.tgz
- EuroParl: http://www.statmt.org/europarl/v7/es-en.tgz
- NewsCommentary: http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz
- Wikipedia: http://dumps.wikimedia.org/enwiki/20140402/enwiki-20140402-pages-articles.xml.bz2
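The archives listed above could be fetched and unpacked with a short script. The following is only a sketch: the `corpora` directory name and the helper names `local_name` and `fetch` are assumptions made for illustration, and the URLs are copied from the list as-is, so they may no longer resolve.

```python
import os
import tarfile
import urllib.request

# URLs copied verbatim from the download list above.
CORPORA = {
    "SETimes": "http://nlp.ffzg.hr/data/corpora/setimes/setimes.en-tr.txt.tgz",
    "EuroParl": "http://www.statmt.org/europarl/v7/es-en.tgz",
    "NewsCommentary": "http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz",
    "Wikipedia": "http://dumps.wikimedia.org/enwiki/20140402/enwiki-20140402-pages-articles.xml.bz2",
}


def local_name(url):
    """Return the file name part of a URL (hypothetical helper)."""
    return url.rsplit("/", 1)[-1]


def fetch(url, dest_dir="corpora"):
    """Download url into dest_dir (assumed name) and unpack .tgz archives."""
    os.makedirs(dest_dir, exist_ok=True)
    path = os.path.join(dest_dir, local_name(url))
    # Skip the download if the file is already present.
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    # Only the .tgz archives are unpacked; the Wikipedia dump is a
    # bzip2-compressed XML file and is left as downloaded.
    if path.endswith(".tgz"):
        with tarfile.open(path, "r:gz") as archive:
            archive.extractall(dest_dir)
    return path
```

Calling `fetch(CORPORA["SETimes"])` would place the archive and its contents under `corpora/`. The names of the files inside each archive are not stated here, so selecting the English side is left out of the sketch.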
Setting up corpora
Coverage targets
| Date | SETimes | EuroParl | Wikipedia | NewsCommentary | Target reached | Notes |
|---|---|---|---|---|---|---|
| 23-04-2014 | 72.90% | ? | ? | ? | √ | Initial value |
| 30-04-2014 | | | | | | |
| 07-05-2014 | | | | | | |
| 14-05-2014 | | | | | | |
| 21-05-2014 | | | | | | Official GSOC start date |
| 28-05-2014 | | | | | | |
| 04-06-2014 | | | | | | |
| 11-06-2014 | | | | | | |
| 18-06-2014 | | | | | | |
| 25-06-2014 | | | | | | |
| 02-07-2014 | | | | | | |
| 09-07-2014 | | | | | | |
| 16-07-2014 | | | | | | |
| 23-07-2014 | | | | | | |
| 30-07-2014 | | | | | | |
| 06-08-2014 | | | | | | |
| 13-08-2014 | | | | | | |
| 22-08-2014 | 90.00% | 90.00% | 90.00% | 90.00% | | Final target |
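The page does not say how the coverage figures are measured. One plausible reading, assumed here, is naive coverage: the share of tokens in a corpus that receive at least one analysis from the morphological analyser. The sketch below computes that figure from text in the Apertium stream format, where each token appears as `^surface/analysis$` and an unknown token has an analysis beginning with `*`. The function names `naive_coverage` and `target_reached` are hypothetical, and escaped characters in the stream are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Simplification: backslash-escaped characters are not handled.
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")


def naive_coverage(analysed_text):
    """Return the percentage of tokens with at least one analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        # The analyser marks an unknown word as ^word/*word$.
        if not match.group(2).startswith("*"):
            known += 1
    if total == 0:
        return 0.0
    return 100.0 * known / total


def target_reached(coverage, target=90.0):
    """Compare a measured coverage with a target (default: the final 90%)."""
    return coverage >= target


sample = "^the/the<det><def><sp>$ ^cat/cat<n><sg>$ ^flibber/*flibber$ ^sat/sit<vblex><past>$"
print(naive_coverage(sample))   # 3 of 4 tokens are analysed
print(target_reached(naive_coverage(sample)))
```

In practice the input would be the analyser's output for a whole corpus rather than a single sentence, and the result would be entered in the row for the corresponding week.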