Difference between revisions of "English and Kazakh/Work plan (GSOC 2014)"

From Apertium
Line 1: Line 1:
==Corpora==

===Downloads===

* SETimes: http://nlp.ffzg.hr/data/corpora/setimes/setimes.en-tr.txt.tgz
* EuroParl: http://www.statmt.org/europarl/v7/es-en.tgz
* NewsCommentary: http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz
* Wikipedia: http://dumps.wikimedia.org/enwiki/20140402/enwiki-20140402-pages-articles.xml.bz2
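Fetching the four archives listed above can be scripted. The following is a minimal sketch, not part of the original work plan: the `corpora` directory name and the helper names `local_name` and `fetch_all` are assumptions made for illustration, and the URLs are copied verbatim from the list, so they may no longer resolve.

```python
import os
import urllib.request
from typing import List
from urllib.parse import urlparse

# URLs copied from the Downloads list; the dictionary keys are the corpus names used on this page.
CORPORA = {
    "SETimes": "http://nlp.ffzg.hr/data/corpora/setimes/setimes.en-tr.txt.tgz",
    "EuroParl": "http://www.statmt.org/europarl/v7/es-en.tgz",
    "NewsCommentary": "http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz",
    "Wikipedia": "http://dumps.wikimedia.org/enwiki/20140402/enwiki-20140402-pages-articles.xml.bz2",
}


def local_name(url: str) -> str:
    """Return the file name component of a download URL."""
    return os.path.basename(urlparse(url).path)


def fetch_all(dest_dir: str = "corpora") -> List[str]:
    """Download every archive into dest_dir (hypothetical location), skipping files already present."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for url in CORPORA.values():
        target = os.path.join(dest_dir, local_name(url))
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)
        paths.append(target)
    return paths
```

Calling `fetch_all()` would place the archives under `corpora/`; unpacking them is left to the tools of your choice (for example Python's `tarfile` and `bz2` modules).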

===Setting up corpora===

==Coverage targets==


Line 4: Line 15:
!rowspan=2| Date ||colspan=4| Corpus ||rowspan=2| Target<br/>reached||rowspan=2| Notes
|-
! Setimes !! Europarl !! Wikipedia !! NewsComment
! SETimes !! EuroParl !! Wikipedia !! NewsCommentary
|-
| 23-04-2014 || 72.90% || ? || ? || ? ||align=center| √ || Initial value

Revision as of 14:50, 23 April 2014

Corpora

Downloads

Setting up corpora

Coverage targets

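{|
!rowspan=2| Date ||colspan=4| Corpus ||rowspan=2| Target<br/>reached||rowspan=2| Notes
|-
! SETimes !! EuroParl !! Wikipedia !! NewsCommentary
|-
| 23-04-2014 || 72.90% || ? || ? || ? ||align=center| √ || Initial value
|-
| 30-04-2014 || || || || || ||
|-
| 07-05-2014 || || || || || ||
|-
| 22-08-2014 || 90.00% || 90.00% || 90.00% || 90.00% || || Final target
|}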
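
The page does not say how the coverage percentages are measured. A common choice is naive coverage: the share of tokens for which the morphological analyser returns at least one analysis. The sketch below assumes that definition and assumes the input is analyser output in the Apertium stream format, where each lexical unit is written `^surface/analysis$` and an unknown word appears as `^word/*word$`. The function name `naive_coverage` is hypothetical, and escaped characters inside lexical units are not handled.

```python
import re

# A lexical unit in the Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def naive_coverage(analysed_text: str) -> float:
    """Return the percentage of lexical units that received at least one analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        parts = match.group(1).split("/")
        total += 1
        # Unknown words carry a single pseudo-analysis that starts with '*'.
        if len(parts) > 1 and not parts[1].startswith("*"):
            known += 1
    return 100.0 * known / total if total else 0.0


# Made-up sample: three recognised units and one unknown word.
sample = "^the/the<det><def><sp>$ ^house/house<n><sg>$ ^blorb/*blorb$ ^is/be<vbser><pri><p3><sg>$"
print(f"{naive_coverage(sample):.2f}%")
```

Punctuation and numerals count as tokens in this sketch, so the figures it produces may differ from ones computed with a different tokenisation.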