Difference between revisions of "Corpora"

From Apertium
* Southeast European Times — http://xixona.dlsi.ua.es/~fran/setimes/ — English,Turkish,Bulgarian,Macedonian,Serbo-Croatian,Albanian,Greek,Romanian — 9,000 approx. paragraph aligned, 90,000—120,000 words.
* South African Government Services — http://xixona.dlsi.ua.es/~fran/services-gov-za-en_ZA-af_ZA.txt — English—Afrikaans — 2,500 approx. sentence aligned, 49,375 words.
* IJS-ELAN — http://nl.ijs.si/elan/ — English-Slovenian


[[Category:Resources]]

Revision as of 19:06, 28 January 2008

Lists of corpora under free licences (public domain, CC-BY-SA, GPL, etc.)

Corpora