Difference between revisions of "RFERL corpora"

From Apertium

Revision as of 07:07, 25 December 2011

Radio Free Europe / Radio Liberty runs news services in a number of Central Asian languages. The information is essentially free for public use with attribution.

[Comment: link to usage info]

Recently we discovered how to build a corpus from their website.

Kyrgyz

2009

  • Number of stems: 4.1M
  • Coverage: 87.4%

2010

  • Number of stems: 3.4M
  • Coverage: 88%

Kazakh

2009

2010

  • Number of stems: 3.2M
  • Coverage: 85.4%
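
This page does not say how the coverage figures were computed. As a rough illustration only, the sketch below assumes the usual naive definition: the percentage of tokens that receive at least one analysis from the morphological analyser, read from output in Apertium stream format, where unknown words are marked with `*`. The function name `coverage` and the sample text (including its tags) are hypothetical and are not taken from the RFERL corpora.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$^]+)/([^$]*)\$")


def coverage(analysed_text: str) -> float:
    """Return the percentage of lexical units with at least one known analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    # The analyser marks an unknown word by prefixing its analysis with '*'.
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return 100.0 * known / len(units)


# Hypothetical analyser output: three known units and one unknown unit.
sample = "^бул/бул<det><dem>$ ^сөз/сөз<n><nom>$ ^абвгд/*абвгд$ ^./.<sent>$"
print(f"{coverage(sample):.1f}%")
```

Under this definition a unit counts as covered even if its analysis is wrong, so the figure is an upper bound on how much of the corpus is analysed correctly.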