Difference between revisions of "RFERL corpora"
Firespeaker (talk | contribs) (→Kyrgyz) |
Firespeaker (talk | contribs) |
Revision as of 07:07, 25 December 2011
Radio Free Europe / Radio Liberty runs news services in a number of Central Asian languages. The information is essentially free for public use with attribution. (To do: link to usage info.)
We recently worked out how to build a corpus from their website; see the wiki page "Turkish and Kyrgyz/Making a corpus from azattyk".
Kyrgyz
- Site: azattyk.org
- Coverage with: kymorph
2009
- Number of stems: 4.1M
- Coverage: 87.4%
2010
- Number of stems: 3.4M
- Coverage: 88%
Kazakh
- Site: azattyq.org
2009
- Number of stems: RFERL corpus/kk/2009/stems
- Coverage: Kazmorph/coverage/rferl2009
2010
- Number of stems: 3.2M
- Coverage: 85.4%
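
The page does not say how the coverage figures above were computed. As a rough illustration only, the sketch below assumes that coverage means the share of corpus tokens for which the morphological analyser returns at least one analysis, and that the analyser output is in the Apertium stream format, where an unknown token appears as `^word/*word$`. The function name `naive_coverage` and the sample string (including its tags) are made up for this example and are not taken from kymorph or Kazmorph.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (assumption: escaped '^', '/' or '$' inside tokens are not handled here)
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def naive_coverage(analysed_text: str) -> float:
    """Return the percentage of tokens that received at least one analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        # An unknown token has a single "analysis" that starts with '*'
        if not match.group(2).startswith("*"):
            known += 1
    if total == 0:
        return 0.0
    return 100.0 * known / total


# Hypothetical analyser output: three analysed tokens and one unknown token
sample = "^бул/бул<prn><dem><nom>$ ^китеп/китеп<n><nom>$ ^xyz/*xyz$ ^жакшы/жакшы<adj>$"
print(f"{naive_coverage(sample):.1f}")
```

Counting tokens rather than distinct word forms means frequent words weigh more heavily in the result; whether the figures on this page were computed that way is not stated.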