Difference between revisions of "English and Kazakh/Work plan (GSOC 2014)"

From Apertium
Line 1: Line 1:
  +
==Corpora==
  +
  +
===Downloads===
  +
  +
* SETimes: http://nlp.ffzg.hr/data/corpora/setimes/setimes.en-tr.txt.tgz
  +
* EuroParl: http://www.statmt.org/europarl/v7/es-en.tgz
  +
* NewsCommentary: http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz
  +
* Wikipedia: http://dumps.wikimedia.org/enwiki/20140402/enwiki-20140402-pages-articles.xml.bz2
  +
  +
===Setting up corpora===
  +
 
==Coverage targets==
 
   
Line 4: Line 15:
 
!rowspan=2| Date ||colspan=4| Corpus ||rowspan=2| Target<br/>reached||rowspan=2| Notes
 
 
|-
 
−
! Setimes !! Europarl !! Wikipedia !! NewsComment
+
! SETimes !! EuroParl !! Wikipedia !! NewsCommentary
 
|-
 
 
| 23-04-2014 || 72.90% || ? || ? || ? ||align=center| √ || Initial value
 

Revision as of 14:50, 23 April 2014

Corpora

Downloads

Setting up corpora
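
This section is still empty in this revision. As a rough sketch only, the archives listed under Downloads could be fetched and unpacked along these lines. The URLs are the ones given on this page; the `corpora` directory, the `local_path` and `fetch` helpers, and the choice to leave the Wikipedia dump compressed are assumptions made for illustration, not part of the work plan.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# URLs copied from the Downloads list of this work plan.
CORPORA = {
    "SETimes": "http://nlp.ffzg.hr/data/corpora/setimes/setimes.en-tr.txt.tgz",
    "EuroParl": "http://www.statmt.org/europarl/v7/es-en.tgz",
    "NewsCommentary": "http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz",
    "Wikipedia": "http://dumps.wikimedia.org/enwiki/20140402/"
                 "enwiki-20140402-pages-articles.xml.bz2",
}


def local_path(url, dest_dir="corpora"):
    """Return where an archive would be stored (hypothetical layout)."""
    return Path(dest_dir) / Path(urlparse(url).path).name


def fetch(url, dest_dir="corpora"):
    """Download one archive unless it is already present; unpack .tgz files."""
    target = local_path(url, dest_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    if target.name.endswith(".tgz"):
        # Unpack next to the archive, in a directory named after it.
        out_dir = target.parent / target.name[: -len(".tgz")]
        with tarfile.open(target, "r:gz") as archive:
            archive.extractall(out_dir)
    # The .bz2 Wikipedia dump is large and is left compressed here.
    return target
```

Calling `fetch(url)` for each value in `CORPORA` would populate the `corpora` directory; `local_path(url)` alone shows where a file would land without touching the network.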

Coverage targets

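{|
!rowspan=2| Date ||colspan=4| Corpus ||rowspan=2| Target<br/>reached||rowspan=2| Notes
|-
! SETimes !! EuroParl !! Wikipedia !! NewsCommentary
|-
| 23-04-2014 || 72.90% || ? || ? || ? ||align=center| √ || Initial value
|-
| 30-04-2014 || || || || || ||
|-
| 07-05-2014 || || || || || ||
|-
| 22-08-2014 || 90.00% || 90.00% || 90.00% || 90.00% || || Final target
|}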
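
The page does not say how the coverage figures are measured. One plausible reading is naive coverage: the share of tokens that receive at least one analysis. The sketch below assumes analyser output in the Apertium stream format (`^surface/analysis$`, with unknown tokens marked by a leading `*` in the analysis). The function names `coverage` and `target_reached` are hypothetical, and escaped characters in the stream are ignored for simplicity.

```python
import re

# One lexical unit in the Apertium stream format:
#   ^surface/analysis1/analysis2$
# An unknown token looks like ^surface/*surface$.
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text):
    """Return the percentage of tokens that received at least one analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        if not match.group(2).startswith("*"):
            known += 1
    return 100.0 * known / total if total else 0.0


def target_reached(value, target=90.0):
    """Compare a coverage value with a target (90.00% is the final target)."""
    return value >= target


# Made-up sample: three known tokens and one unknown token.
sample = (
    "^The/the<det><def><sp>$ ^cat/cat<n><sg>$ "
    "^blorp/*blorp$ ^sleeps/sleep<vblex><pri><p3><sg>$"
)
print(coverage(sample))        # 75.0
print(target_reached(72.90))   # False
```

Running the same function over each analysed corpus would give one value per column of the table.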