Difference between revisions of "Specific resources per language"
Revision as of 06:28, 14 February 2014
The incubator can be found here. It provides a place for people to put dictionaries and other material that is useful in constructing language pairs.
Specific resources per language
Here are some links to resources that might be useful for expanding on work in the Incubator. Add further resources below that would help in constructing language pairs. Try to mark each one with its licence, or at least as free or non-free.
See also the individual language pages.
Albanian
- Dictionary: apertium-mk-sq.sq.dix
- Resources
- http://www.albanianoverview.com/grammar.htm
- http://www.idividi.com.mk/recnik/index.htm -- Albanian--Macedonian dictionary (non-free)
Armenian
- Dictionary: apertium-hy-en.hy.dix
- Resources
- http://www.armeniapedia.org/index.php?title=Category:Armenian_Language_Lessons
Assamese and Hindi
- Dictionary: apertium-as-hi.as.dix, apertium-as-hi.hi.dix, apertium-as-hi.as-hi.dix, apertium-as-hi.trules.xml
--- Anusuya
Belarusian
- GFDL grammar of the language: http://www.vitba.org/fofmb/fofmb.html
Bengali
- http://bengalinux.sourceforge.net/cgi-bin/anubadok/index.pl -- Free software translation for English→Bengali
- http://anubadok.sf.net/ -- See above
Bulgarian
- Dictionary: apertium-mk-bg.bg.dix
- Resources
- Bulgarian verbal morphology: http://www.sfs.nphil.uni-tuebingen.de/iscl/Theses/zhechev.pdf
Cornish
- Dictionary: apertium-cy-kw.kw.dix
- Resources
- Cornish-Welsh bilingual wordlist: http://kevindonnelly.org.uk/kernewek/
Czech
- Dictionary: apertium-pl-cs.cs.dix.xml
- Resources
- Most frequent words. Also includes a list of the most frequent bi- and tri-grams, but these are of little use as multiwords.
- James Naughton's links
- Some complications with diacritics
- Czech morphological guesser: http://ufal.mff.cuni.cz/pdt/Morphology_and_Tagging/Morphology/index.html -- 'free', but not open source
Faroese
- Dictionary: apertium-fo-is.fo.dix
- Resources
- Faroese Constraint Grammar: http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-fo-is.fo.rle
Finnish
- See also: Omorfi
- Resources
- http://kaino.kotus.fi/sanat/nykysuomi/ — full form list for Finnish -- LGPL
- Omorfi–Open Morphology for Finnish language
- Helsinki Finite-State Transducer Technology (HFST)
  s  = lemma
  hn = homonymy ref
  t  = inflection info
  tn = inflection number (referring to table)
  av = ref to consonant gradation
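The field key above can be turned into a small reader. The following is a minimal sketch in Python, assuming the key describes XML elements of the full form list linked above, that each entry is an `st` element, and that `tn` and `av` are nested inside `t`. The sample data, the `kotus-sanalista` root element, the nesting, and the function name `parse_entries` are all assumptions made for illustration, not taken from this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the element nesting and the values are assumed,
# only the field names (s, hn, t, tn, av) come from the key above.
SAMPLE = """<kotus-sanalista>
  <st><s>aakkonen</s><t><tn>38</tn></t></st>
  <st><s>takki</s><t><tn>5</tn><av>A</av></t></st>
  <st><s>kuusi</s><hn>1</hn><t><tn>24</tn></t></st>
</kotus-sanalista>"""


def parse_entries(xml_text):
    """Return one dict per entry, with None for fields that are absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for st in root.iter("st"):
        t = st.find("t")  # inflection info block, may be missing
        entries.append({
            "lemma": st.findtext("s"),
            "homonym": st.findtext("hn"),
            "inflection_number": t.findtext("tn") if t is not None else None,
            "gradation": t.findtext("av") if t is not None else None,
        })
    return entries


if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(entry)
```

A flat dictionary per entry keeps the output easy to convert into dictionary paradigms later; the real file may need different handling if its layout differs from the assumed one.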
German and English
German-English bilingual dictionary (>216,000 entries), generated from linguistic data (GPL Version 2 or later) available for "Ding: A Dictionary LookUp program" (version 1.5 2007-04-09) from Frank Richter, Technische Universität Chemnitz
- Dictionary: apertium-de-en.dix
Greek
- Dictionary: apertium-en-el.el.dix
- Resources
- Greek <-> Ukrainian, Russian, Polish Grammar & Dictionary: http://ellinika.gnu.org.ua/
Hebrew
- Resources
- http://www.mila.cs.technion.ac.il/english/resources/lexicons/ lexicons for Hebrew, in weird XLS format -- GPL
- http://www.mila.cs.technion.ac.il/english/resources/software_downloads/index.html Hebrew Morphological Analyzer (for Hebrew undotted text) -- GPL, but download link behind a password
- http://www.cs.technion.ac.il/~barhaim/MorphTagger/ HMM-based part-of-speech tagger for Hebrew -- GPL
- http://www.cs.technion.ac.il/~erelsgl/bxi/hmntx/teud.html Probabilistic Morphological Analyzer for Hebrew undotted text -- license unknown
- http://hspell.ivrix.org.il/ The hspell Hebrew spell-checker has a mode for analyzing morphological data -- GPL
- http://www.code972.com/blog/hebmorph/ HebMorph is the analyser powering hspell's capabilities -- GPL
Hindi
- See also: Hindi
- Resources
- POS tagged English-Hindi wordlist: http://indlinux.sourceforge.net/downloads/files/hindidict.txt.bz2
- https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-en-hi
- https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi-en-unicode
- https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi.hi.dix
- https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi.hi_WX.dix
- https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi-ur.hi.dix
- https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi-ur.hi.dix.old
Iranian Persian
- Dictionary: apertium-tg-fa.fa.dix
- Resources
- Grammar of Persian: http://books.google.com/books?vid=OCLC20216670&id=Ru1ncSqiRXkC&printsec=titlepage&hl=de#PPA24,M1
Ingush
- Resources
- Lexical database (non-free)
- Ingush-English dict (non-free)
Lithuanian
- Dictionary: apertium-en-lt.lt.dix
- Resources
Nogai
- Resources
- Grammar Sketch and Russian-Nogai dictionary: http://ksirov.ru/%D1%8F%D0%B7%D1%8B%D0%BA%D0%B8/%D0%BD%D0%BE%D0%B3%D0%B0%D0%B9%D1%81%D0%BA%D0%B8%D0%B9
Ossetian
- Dictionary: apertium-os-fa.os.dix
- Resources
- Ossetian: Grammatical Sketch — quite nice and comprehensive.
- Ossetic National Corpus: http://www.ossetic-studies.org/
Piemontese
- Dictionary: apertium-it-pms.pms.dix
- Resources
- http://members.fortunecity.it/dotorcarlo/vocen1.html Piemontese--English -- public domain
- http://digilander.libero.it/dotor43/indexit.html -- Piemontese grammar incl. 17k word Piemontese--Italian dictionary (POS tagged and partly annotated for inflection). The site states: "© These pages can be freely used for all purposes, but not for political reasons, and not against the laws (no matter what is the country)."
Portuguese
Although Apertium has a stable es-pt pair, the Brazilian Portuguese dictionary built at NILC (Universidade de São Paulo) for Unitex has much better coverage and could perhaps be used to improve it.
- Resources
We believe it is released under an LGPL licence.
Punjabi
- Resources
- Punjabi lexicon: http://www.lama.univ-savoie.fr/~humayoun/punjabi/index.html
Quechua
- Resources
- http://www.runasimipi.org/
- AVENUE Quechua-Spanish system. (ask Francis Tyers)
Russian
- Dictionary: monodix
- Bidix: Polish-Russian
- Bidix: English-Russian
- Resources
- http://www.alphadictionary.com/rusgrammar/
- http://www.seelrc.org:8080/grammar/pdf/stand_alone_russian.pdf
- Russian analyser - non-free, Windows only
- Using Czech resources for the morphological analysis of Russian
- Pere - free translator, including Russian<->Ukrainian<->English dictionaries. Built from alignments, low quality.
- Russian--Tajik phrase dictionary, 41k entries.
- Another Tajik--Russian dictionary
Sanskrit संस्कृतम्
- Dictionary: apertium-sa-XX
- Resources
- Material available for download: http://www.sanskrit-lexicon.uni-koeln.de/download.html
Slovakian
- Dictionary: apertium-pl-sk.sk.dix
- Resources
- http://old.bohemica.com/slovak/slovakgrammar.pdf (Slovakian, with some English)
- http://pl.wiktionary.org/wiki/Aneks:J%C4%99zyk_s%C5%82owacki_-_tabele_koniugacji (In Polish)
- http://www.angelfire.com/sk3/quality/Slovak_declension.html
- http://www.juls.savba.sk/msj/
Urdu
- Dictionary: apertium-hi-ur.ur.dix
- Resources
- http://www.lama.univ-savoie.fr/~humayoun/UrduMorph/ — GPL analyser of Urdu
- http://www.crulp.org/software/langproc/E2UMachineTranslationSystem.htm -- Urdu--English MT system