Specific resources per language
The incubator can be found here. It provides a place for people to put dictionaries and other material that is useful in constructing language pairs. On this page you can list resources that will be useful in that construction. Try to mark each one with its licence, or at least as free or non-free.
Albanian
- Dictionary: apertium-mk-sq.sq.dix
- Resources
Armenian
- Dictionary: apertium-hy-en.hy.dix
- Resources
Breton
- Dictionary: apertium-br-fr.br.dix
- Resources
- http://fr.wiktionary.org/wiki/Cat%C3%A9gorie:Grammaire_en_breton
- http://books.google.com/books?id=SQYPenZO6SUC&pg=PA1&dq=modern+breton&sig=9RjVmVzuA8iV5kzahLL_0sHaDmQ
- http://books.google.com/books?id=YYkCAAAAQAAJ&printsec=frontcover&dq=breton&num=100&as_brr=1#PPR5,M1 (public domain Breton-French dictionary and grammar)
- http://www.preder.net/klask.php (non-free)
Bulgarian
- Dictionary: apertium-mk-bg.bg.dix
- Resources
Cornish
- Dictionary: apertium-cy-kw.kw.dix
- Resources
Greek
- Dictionary: apertium-en-el.el.dix
- Resources
- Greek <-> Ukrainian, Russian, Polish grammar and dictionary: http://ellinika.gnu.org.ua/
Hindi
- Dictionary: apertium-hi-en.hi.dix
- Resources
- Morphological analyser: http://www.iiit.net/ltrc/morph/index.htm (GPL)
- POS tagged English-Hindi wordlist: http://indlinux.sourceforge.net/downloads/files/hindidict.txt.bz2
Iranian Persian
- Dictionary: apertium-tg-fa.fa.dix
- Resources
Portuguese
Although Apertium has a stable es-pt pair, the coverage of the Brazilian Portuguese dictionary built at NILC (Universidade de São Paulo) for Unitex is much better, and it could perhaps be used to improve the pair.
- Resources
We believe the NILC dictionary has an LGPL license.
Russian
- Dictionary: monodix
- Bidix: Polish-Russian
- Bidix: English-Russian
- Resources
- http://www.alphadictionary.com/rusgrammar/
- http://www.seelrc.org:8080/grammar/pdf/stand_alone_russian.pdf
Slovakian
- Dictionary: apertium-pl-sk.sk.dix
- Resources
- http://old.bohemica.com/slovak/slovakgrammar.pdf (Slovakian, with some English)
- http://pl.wiktionary.org/wiki/Aneks:J%C4%99zyk_s%C5%82owacki_-_tabele_koniugacji (In Polish)
- http://www.angelfire.com/sk3/quality/Slovak_declension.html
Swedish - Danish
- Pair: apertium-sv-da
- Resources
- http://w3.msi.vxu.se/~nivre/research/Talbanken05.html (A 300,000-word treebank in XML. All words are tagged with PAROLE-style tags, so it should be easy to build a morphological analyser and a PoS tagger from it. The authors are likely to be happy to let us use it if we cite them.)
- http://www.isv.cbs.dk/~mbk/treebank/ (Danish treebank, 100,000 words, as above, under the GPL)
- http://www.ling.su.se/staff/sofia/suc/suc.html (Stockholm Umeå Corpus: 1,000,000 Swedish words, tagged. A license has to be granted by the authors. It was used for apertium-sv-da.)
Quechua
- Resources
- http://www.runasimipi.org/
- AVENUE Quechua-Spanish system. (ask Francis Tyers)
Norwegian
- Resources
- http://www.edd.uio.no/prosjekt/ordbanken/ -- huge GPL word bank of Norwegian (Bokmål + Nynorsk)
Urdu
- Dictionary: apertium-hi-ur.ur.dix
- Resources
- http://www.lama.univ-savoie.fr/~humayoun/UrduMorph/ — GPL analyser of Urdu
Finnish
- Resources
- http://kaino.kotus.fi/sanat/nykysuomi/ — full form list for Finnish -- LGPL
Field key: s = lemma; hn = homonymy reference; t = inflection info; tn = inflection number (referring to a table); av = reference to consonant gradation
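The field key above can be turned into a small reader. The following is only a sketch under stated assumptions: it assumes the list is distributed as XML, that the field names are element names, that each entry is wrapped in an `st` element, and that `tn` and `av` are nested inside `t`. None of this is stated on this page, and the sample entries and their values are made up for illustration, so check them against the actual file before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The <wordlist> and <st> wrappers, the nesting of
# <tn>/<av> inside <t>, and the values themselves are assumptions made for
# illustration; they are not copied from the real word list.
SAMPLE = """<wordlist>
<st><s>aalto</s><t><tn>1</tn><av>I</av></t></st>
<st><s>kuusi</s><hn>1</hn><t><tn>24</tn></t></st>
</wordlist>"""


def parse_entries(xml_text):
    """Return one dict per entry, using the field key from this page."""
    root = ET.fromstring(xml_text)
    entries = []
    for st in root.iter("st"):
        t = st.find("t")  # inflection info block, may be absent
        entries.append({
            "lemma": st.findtext("s"),        # s = lemma
            "homonym": st.findtext("hn"),     # hn = homonymy reference
            "inflection_number": t.findtext("tn") if t is not None else None,
            "gradation": t.findtext("av") if t is not None else None,
        })
    return entries


ENTRIES = parse_entries(SAMPLE)
for entry in ENTRIES:
    print(entry)
```

Missing fields come back as `None`, so entries without a homonymy reference or consonant gradation can be handled uniformly when generating dictionary entries.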
Hebrew
- Resources
- http://www.mila.cs.technion.ac.il/english/resources/lexicons/ -- lexicons for Hebrew, in weird XLS format -- GPL