Difference between revisions of "Specific resources per language"

From Apertium
(Category:Documentation in English)
(14 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 
{{TOCD}}
 
{{TOCD}}
The incubator can be found [https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/ here]. It provides a place for people to put dictionaries and other stuff that is useful in constructing language pairs.
+
The incubator is listed in the 'incubator' column of the source browser at https://apertium.github.io/apertium-on-github/source-browser.html. It houses language pairs that have not yet matured and are still under development.
  +
   
 
==Specific resources per language==
 
==Specific resources per language==
Line 6: Line 7:
 
Here are some links to resources that might be useful for expanding on work in the Incubator. Add further resources below that will be useful in constructing language pairs, and try to mark each one with its licence, or at least as free or non-free.
 
Here are some links to resources that might be useful for expanding on work in the Incubator. Add further resources below that will be useful in constructing language pairs, and try to mark each one with its licence, or at least as free or non-free.
   
  +
See also the individual language pages.
===Albanian===
 
  +
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-mk-sq.sq.dix apertium-mk-sq.sq.dix]''
 
 
===[[Albanian]]===
  +
:''Dictionary: [https://github.com/apertium/apertium-sqi/blob/master/apertium-sqi.sqi.dix Albanian Monodix]''
   
 
;Resources
 
;Resources
Line 13: Line 16:
 
* http://www.idividi.com.mk/recnik/index.htm -- Albanian--Macedonian dictionary (non-free)
 
* http://www.idividi.com.mk/recnik/index.htm -- Albanian--Macedonian dictionary (non-free)
   
===Armenian===
+
===[[Armenian]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hy-en.hy.dix apertium-hy-en.hy.dix]''
+
:''Dictionary: [https://github.com/apertium/apertium-hye/blob/master/apertium-hye.hye.dix Armenian Monodix]''
   
 
;Resources
 
;Resources
Line 20: Line 23:
 
* http://www.armeniapedia.org/index.php?title=Category:Armenian_Language_Lessons
 
* http://www.armeniapedia.org/index.php?title=Category:Armenian_Language_Lessons
   
===Assamese - Hindi===
+
===[[Assamese and Hindi]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-as-hi.as.dix apertium-as-hi.hi.dix apertium-as-hi.as-hi.dix apertium-as-hi.trules.xml]''
+
:''Dictionary: [https://github.com/apertium/apertium-as-hi/blob/91f3c38b0c636deb620cbd27725d63dd763c5f0b/apertium-as-hi.hi.dix Assamese-Hindi Bidix]''
  +
   
 
--- Anusuya
 
--- Anusuya
   
===Belarusian===
+
===[[Belarusian]]===
   
 
* [http://www.vitba.org/fofmb/fofmb.html GFDL grammar of the language]
 
* [http://www.vitba.org/fofmb/fofmb.html GFDL grammar of the language]
   
===Bengali===
+
===[[Bengali]]===
   
 
* http://bengalinux.sourceforge.net/cgi-bin/anubadok/index.pl -- Free software translation for English→Bengali
 
* http://bengalinux.sourceforge.net/cgi-bin/anubadok/index.pl -- Free software translation for English→Bengali
 
* http://anubadok.sf.net/ -- See above
 
* http://anubadok.sf.net/ -- See above
   
===Bulgarian===
+
===[[Bulgarian]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-mk-bg.bg.dix apertium-mk-bg.bg.dix]''
+
:''Dictionary: [https://raw.githubusercontent.com/apertium/apertium-bul/master/apertium-bul.bul.dix Bulgarian Monodix]''
   
 
;Resources
 
;Resources
Line 41: Line 45:
 
* [http://www.sfs.nphil.uni-tuebingen.de/iscl/Theses/zhechev.pdf Bulgarian verbal morphology]
 
* [http://www.sfs.nphil.uni-tuebingen.de/iscl/Theses/zhechev.pdf Bulgarian verbal morphology]
   
===Cornish===
+
===[[Cornish]]===
  +
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-cy-kw.kw.dix apertium-cy-kw.kw.dix]''
 
 
:''Dictionary: [https://sourceforge.net/projects/apertium/files/apertium-cy-en/0.1.0/ Cornish Monodix from SourceForge]''
  +
  +
'''This resource has not been migrated to GitHub from SVN
  +
'''
   
 
;Resources
 
;Resources
Line 49: Line 57:
 
* [http://kevindonnelly.org.uk/kernewek/ Cornish-Welsh bilingual wordlist]
 
* [http://kevindonnelly.org.uk/kernewek/ Cornish-Welsh bilingual wordlist]
   
===Czech===
+
===[[Czech]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-pl-cs.cs.dix.xml apertium-pl-cs.cs.dix.xml]''
+
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-pl-cs.cs.dix.xml apertium-pl-cs.cs.dix.xml]''
  +
'''This resource has not been migrated to GitHub from SVN
  +
'''
  +
  +
:''Dictionary: [https://github.com/apertium/apertium-eo-cs/blob/c16fa21194a285941307a68e420c194a1825ebc3/apertium-eo-cs.eo-cs.dix Czech-Esperanto Bidix]''
  +
:''Dictionary: [https://github.com/apertium/apertium-cs-sl/tree/062fa172705e16f77302a8096df3733581079fb8 Czech-Slovenian Bidix]''
 
;Resources
 
;Resources
   
Line 58: Line 71:
 
* [http://ufal.mff.cuni.cz/pdt/Morphology_and_Tagging/Morphology/index.html Czech morphological guesser] - 'free', but not open source
 
* [http://ufal.mff.cuni.cz/pdt/Morphology_and_Tagging/Morphology/index.html Czech morphological guesser] - 'free', but not open source
   
===Faroese===
+
===[[Faroese]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-fo-is.fo.dix apertium-fo-is.fo.dix]''
+
:''Dictionary: [https://github.com/apertium/apertium-fao/blob/master/apertium-fao.fao.dix Faroese Monodix]''
   
 
;Resources
 
;Resources
 
* [http://giellatekno.uit.no/cgi/d-fao.eng.html U. Tromsø -- Faroese analyser ]
 
* [http://giellatekno.uit.no/cgi/d-fao.eng.html U. Tromsø -- Faroese analyser ]
* [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-fo-is.fo.rle Faroese Constraint Grammar]
+
* [https://github.com/apertium/apertium-fao-isl/blob/master/apertium-fao-isl.fao-isl.rlx Faroese Constraint Grammar]
  +
* [http://www.archive.org/details/frskanthologi00denmgoog Faroese-Danish dictionary from 1886]
   
===Finnish===
+
===[[Finnish]]===
 
{{see-also|Omorfi}}
 
{{see-also|Omorfi}}
 
;Resources
 
;Resources
Line 80: Line 94:
 
</pre>
 
</pre>
   
===German - English===
+
===[[German and English]]===
   
 
German-English bilingual dictionary (>216,000 entries), generated from linguistic data (GPL Version 2 or later) available for [http://www-user.tu-chemnitz.de/~fri/ding/ "Ding: A Dictionary LookUp program"] (version 1.5 2007-04-09) from Frank Richter, [http://tu-chemnitz.de Technische Universität Chemnitz]
 
German-English bilingual dictionary (>216,000 entries), generated from linguistic data (GPL Version 2 or later) available for [http://www-user.tu-chemnitz.de/~fri/ding/ "Ding: A Dictionary LookUp program"] (version 1.5 2007-04-09) from Frank Richter, [http://tu-chemnitz.de Technische Universität Chemnitz]
   
:''Dictionary: [https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-de-en apertium-de-en.dix]''
+
:''[https://github.com/apertium/apertium-eng-deu/blob/master/apertium-eng-deu.eng-deu.dix German-English Dictionary]''
   
===Greek===
+
===[[Greek]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-en-el.el.dix apertium-en-el.el.dix]
+
:''Dictionary: [https://github.com/apertium/apertium-ell/blob/master/apertium-ell.ell.dix Greek Monodix]
  +
:''Greek-English Dictionary: [https://github.com/apertium/apertium-ell-eng/blob/master/apertium-ell-eng.eng.dix Greek-English Dictionary]
   
 
;Resources
 
;Resources
Line 93: Line 108:
 
* Greek <-> Ukrainian, Russian, Polish Grammar & Dictionary: http://ellinika.gnu.org.ua/
 
* Greek <-> Ukrainian, Russian, Polish Grammar & Dictionary: http://ellinika.gnu.org.ua/
   
===Hebrew===
+
===[[Hebrew]]===
   
 
;Resources
 
;Resources
Line 104: Line 119:
 
* http://www.code972.com/blog/hebmorph/ HebMorph is the analyser powering hspell's capabilities -- GPL
 
* http://www.code972.com/blog/hebmorph/ HebMorph is the analyser powering hspell's capabilities -- GPL
   
===Hindi===
+
===[[Hindi]]===
 
{{see-also|Hindi}}
 
{{see-also|Hindi}}
   
Line 111: Line 126:
 
* POS tagged English-Hindi wordlist: http://indlinux.sourceforge.net/downloads/files/hindidict.txt.bz2
 
* POS tagged English-Hindi wordlist: http://indlinux.sourceforge.net/downloads/files/hindidict.txt.bz2
   
* https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-en-hi
+
* https://github.com/unhammer/apertium-en-hi/blob/master/apertium-en-hi.en.dix
* https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi-en-unicode
+
* https://github.com/apertium/apertium-hin/blob/master/apertium-hin.hin.dix
* https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi.hi.dix
+
* https://github.com/apertium/apertium-urd-hin/blob/master/dev/en-hi-ur.list
* https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi.hi_WX.dix
+
* https://github.com/apertium/apertium-urd-hin/blob/master/apertium-urd-hin.urd-hin.dix
* https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi-ur.hi.dix
 
* https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi-ur.hi.dix.old
 
   
  +
===Iranian Persian===
 
  +
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-tg-fa.fa.dix apertium-tg-fa.fa.dix]''
 
 
===[[Iranian Persian]]===
  +
:''Dictionary: [https://github.com/apertium/apertium-pes/blob/master/apertium-pes.pes.dix Persian Monodix]''
   
 
;Resources
 
;Resources
Line 125: Line 140:
 
* [http://books.google.com/books?vid=OCLC20216670&id=Ru1ncSqiRXkC&printsec=titlepage&hl=de#PPA24,M1 Grammar of Persian]
 
* [http://books.google.com/books?vid=OCLC20216670&id=Ru1ncSqiRXkC&printsec=titlepage&hl=de#PPA24,M1 Grammar of Persian]
   
===Ingush===
+
===[[Ingush]]===
   
 
; Resources
 
; Resources
Line 132: Line 147:
 
* [http://books.google.com/books?id=J7wqVHeRWdwC&pg=PA5&lpg=PA5&dq=ingush+father&source=bl&ots=N8TDZudzGZ&sig=JO9X_Y9gio7dUhZWeyZX7j17iPw&hl=ca&ei=vfq4TM6CH86OjAfO94XaDg&sa=X&oi=book_result&ct=result&resnum=3&ved=0CB8Q6AEwAg#v=onepage&q=ingush%20father&f=false Ingush-English dict] (non-free)
 
* [http://books.google.com/books?id=J7wqVHeRWdwC&pg=PA5&lpg=PA5&dq=ingush+father&source=bl&ots=N8TDZudzGZ&sig=JO9X_Y9gio7dUhZWeyZX7j17iPw&hl=ca&ei=vfq4TM6CH86OjAfO94XaDg&sa=X&oi=book_result&ct=result&resnum=3&ved=0CB8Q6AEwAg#v=onepage&q=ingush%20father&f=false Ingush-English dict] (non-free)
   
===Lithuanian===
+
===[[Latvian]]===
  +
;Resources
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-en-lt.lt.dix apertium-en-lt.lt.dix]''
 
  +
* https://github.com/PeterisP/morphology GPL full-form dictionary (https://github.com/PeterisP/morphology/blob/master/src/main/resources/Lexicon.xml)
  +
  +
;See also
  +
* [[Latvian and Russian]]
  +
  +
===[[Lithuanian]]===
  +
:''Dictionary: [https://github.com/apertium/apertium-lit/blob/master/apertium-lit.lit.dix Lithuanian Monodix]''
   
 
;Resources
 
;Resources
   
===Ossetian===
+
===[[Nogai]]===
  +
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-os-fa.os.dix apertium-os-fa.os.dix]''
 
  +
; Resources
  +
  +
* [http://ksirov.ru/%D1%8F%D0%B7%D1%8B%D0%BA%D0%B8/%D0%BD%D0%BE%D0%B3%D0%B0%D0%B9%D1%81%D0%BA%D0%B8%D0%B9 Grammar Sketch and Russian-Nogai dictionary]
  +
  +
===[[Ossetian]]===
  +
:''Dictionary: [https://github.com/apertium/apertium-oss/blob/master/apertium-oss.oss.dix Ossetian Monodix]''
   
 
;Resources
 
;Resources
Line 145: Line 173:
 
* [http://www.ossetic-studies.org/ Ossetic National Corpus]
 
* [http://www.ossetic-studies.org/ Ossetic National Corpus]
   
===Piemontese===
+
===[[Piemontese]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-it-pms.pms.dix apertium-it-pms.pms.dix]''
+
:''Dictionary: [https://sourceforge.net/p/apertium/svn/HEAD/tree/incubator/apertium-it-pms.pms.dix Piemontese Monodix from SourceForge]''
  +
'''This resource has not been migrated to GitHub from SVN
  +
'''
  +
 
;Resources
 
;Resources
   
Line 152: Line 183:
 
* http://digilander.libero.it/dotor43/indexit.html -- Piemontese grammar including a 17k-word Piemontese--Italian dictionary (POS tagged and partly annotated for inflection). The site states: "© These pages can be freely used for all purposes, but not for political reasons, and not against the laws (no matter what is the country)."
 
* http://digilander.libero.it/dotor43/indexit.html -- Piemontese grammar including a 17k-word Piemontese--Italian dictionary (POS tagged and partly annotated for inflection). The site states: "© These pages can be freely used for all purposes, but not for political reasons, and not against the laws (no matter what is the country)."
   
===Portuguese===
+
===[[Portuguese]]===
   
 
Although Apertium has a stable es-pt pair, the coverage of the Brazilian Portuguese Dictionary built at NILC (Universidade de São Paulo) for Unitex is much better, and could perhaps be used to improve it.
 
Although Apertium has a stable es-pt pair, the coverage of the Brazilian Portuguese Dictionary built at NILC (Universidade de São Paulo) for Unitex is much better, and could perhaps be used to improve it.
Line 162: Line 193:
 
We believe it has an LGPL license.
 
We believe it has an LGPL license.
   
===Punjabi===
+
===[[Punjabi]]===
   
 
; Resources
 
; Resources
Line 168: Line 199:
 
* [http://www.lama.univ-savoie.fr/~humayoun/punjabi/index.html Punjabi lexicon]
 
* [http://www.lama.univ-savoie.fr/~humayoun/punjabi/index.html Punjabi lexicon]
   
===Quechua===
+
===[[Quechua]]===
   
 
;Resources
 
;Resources
Line 175: Line 206:
 
* AVENUE Quechua-Spanish system. (ask [[User:Francis Tyers|Francis Tyers]])
 
* AVENUE Quechua-Spanish system. (ask [[User:Francis Tyers|Francis Tyers]])
   
===Russian===
+
===[[Russian]]===
   
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-pl-ru.ru.dix.xml monodix]''
+
:''Dictionary: [https://github.com/apertium/apertium-rus/blob/master/apertium-rus.rus.dix monodix]''
:''Bidix: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-pl-ru.pl-ru.dix.xml Polish-Russian]''
+
:''Bidix: [https://github.com/apertium/apertium-pol-rus/blob/master/apertium-pol-rus.pol-rus.dix Polish-Russian]''
:''Bidix: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-en-ru.en-ru.dix.xml English-Russian]
+
:''Bidix: [https://github.com/apertium/apertium-rus-eng/blob/master/apertium-ru-en.ru.dix English-Russian]
   
 
;Resources
 
;Resources
Line 191: Line 222:
 
* [http://www.lugattj.com/news.php?tid=1&ln=en Another Tajik--Russian dictionary]
 
* [http://www.lugattj.com/news.php?tid=1&ln=en Another Tajik--Russian dictionary]
   
===Sanskrit '''संस्कृतम्'''===
+
===[[Sanskrit]] '''संस्कृतम्'''===
:''Dictionary: [https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-sa-XX apertium-sa-XX]
+
:''Dictionary: [https://github.com/apertium/apertium-san/blob/master/apertium-san.san.dix Sanskrit Monodix]
   
 
;Resources
 
;Resources
Line 199: Line 230:
 
* [http://www.sanskrit-lexicon.uni-koeln.de/download.html Material available for download].
 
* [http://www.sanskrit-lexicon.uni-koeln.de/download.html Material available for download].
   
===Slovakian===
+
===[[Slovakian]]===
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-pl-sk.sk.dix apertium-pl-sk.sk.dix]''
+
:''Dictionary: [https://github.com/apertium/apertium-slk/blob/master/apertium-slk.slk.dix Slovak Monodix]''
   
 
;Resources
 
;Resources
Line 209: Line 240:
 
* http://www.juls.savba.sk/msj/
 
* http://www.juls.savba.sk/msj/
   
===Urdu===
+
===[[Thai]]===
  +
* https://github.com/veer66/Yaitron Yaitron English-Thai and Thai-English XML dictionary, license seems standard 4-clause
:''Dictionary: [http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hi-ur.ur.dix apertium-hi-ur.ur.dix]''
 
  +
  +
===[[Urdu]]===
  +
:''Dictionary: [https://github.com/apertium/apertium-urd/blob/master/apertium-urd.urd.dix Urdu Monodix]''
  +
:''Bidix: [https://github.com/apertium/apertium-urd-hin/blob/master/apertium-urd-hin.urd-hin.dix Hindi-Urdu Bidix]''
   
 
;Resources
 
;Resources
 
* http://www.lama.univ-savoie.fr/~humayoun/UrduMorph/ &mdash; GPL analyser of Urdu
 
* http://www.lama.univ-savoie.fr/~humayoun/UrduMorph/ &mdash; GPL analyser of Urdu
 
* http://www.crulp.org/software/langproc/E2UMachineTranslationSystem.htm -- Urdu--English MT system
 
* http://www.crulp.org/software/langproc/E2UMachineTranslationSystem.htm -- Urdu--English MT system
  +
  +
  +
==GitHub Migration==
  +
  +
For languages whose resources are not yet on GitHub, you can use [[apertium-init]] to create the corresponding repository and then add the files from SVN to that repository.
  +
  +
   
   

Revision as of 13:39, 30 November 2018

The incubator is listed in the 'incubator' column of the source browser at https://apertium.github.io/apertium-on-github/source-browser.html. It houses language pairs that have not yet matured and are still under development.


Specific resources per language

Here are some links to resources that might be useful for expanding on work in the Incubator. Add further resources below that will be useful in constructing language pairs, and try to mark each one with its licence, or at least as free or non-free.

See also the individual language pages.

Albanian

Dictionary: Albanian Monodix
Resources

Armenian

Dictionary: Armenian Monodix
Resources

Assamese and Hindi

Dictionary: Assamese-Hindi Bidix


--- Anusuya

Belarusian

Bengali

Bulgarian

Dictionary: Bulgarian Monodix
Resources

Cornish

Dictionary: Cornish Monodix from SourceForge

This resource has not been migrated to GitHub from SVN

Resources

Czech

Dictionary: apertium-pl-cs.cs.dix.xml

This resource has not been migrated to GitHub from SVN

Dictionary: Czech-Esperanto Bidix
Dictionary: Czech-Slovenian Bidix
Resources

Faroese

Dictionary: Faroese Monodix
Resources

Finnish

See also: Omorfi
Resources
s = lemma
hn = homonymy ref
t = inflection info
tn = inflection number (referring to table)
av = ref to consonant gradation

German and English

German-English bilingual dictionary (>216,000 entries), generated from linguistic data (GPL Version 2 or later) available for "Ding: A Dictionary LookUp program" (version 1.5 2007-04-09) from Frank Richter, Technische Universität Chemnitz

German-English Dictionary

Greek

Dictionary: Greek Monodix
Greek-English Dictionary: Greek-English Dictionary
Resources

Hebrew

Resources

Hindi

See also: Hindi
Resources


Iranian Persian

Dictionary: Persian Monodix
Resources

Ingush

Resources

Latvian

Resources
See also

Lithuanian

Dictionary: Lithuanian Monodix
Resources

Nogai

Resources

Ossetian

Dictionary: Ossetian Monodix
Resources

Piemontese

Dictionary: Piemontese Monodix from SourceForge

This resource has not been migrated to GitHub from SVN

Resources

Portuguese

Although Apertium has a stable es-pt pair, the coverage of the Brazilian Portuguese Dictionary built at NILC (Universidade de São Paulo) for Unitex is much better, and could perhaps be used to improve it.

Resources

We believe it has an LGPL license.

Punjabi

Resources

Quechua

Resources

Russian

Dictionary: monodix
Bidix: Polish-Russian
Bidix: English-Russian
Resources

Sanskrit संस्कृतम्

Dictionary: Sanskrit Monodix
Resources

Slovakian

Dictionary: Slovak Monodix
Resources

Thai

Urdu

Dictionary: Urdu Monodix
Bidix: Hindi-Urdu Bidix
Resources


GitHub Migration

For languages whose resources are not yet on GitHub, you can use apertium-init to create the corresponding repository and then add the files from SVN to that repository.
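
When planning a migration, it can help to know where a dictionary would be expected to live once its repository exists. The sketch below only reproduces the naming pattern visible in the migrated links on this page (for example apertium-sqi.sqi.dix and apertium-pol-rus.pol-rus.dix). The function names, the `master` branch default, and the idea that every module follows this layout are assumptions made for illustration; a given repository may not exist or may be organised differently.

```python
"""Sketch: guess where a migrated Apertium dictionary would live on GitHub.

Assumption (not stated by the page): repositories follow the layout seen in
the migrated links, e.g. apertium-sqi/blob/master/apertium-sqi.sqi.dix.
"""

# Organisation prefix taken from the GitHub links on this page.
GITHUB_ORG = "https://github.com/apertium"


def _check_code(code: str) -> str:
    """Normalise a language code and require three ASCII letters (e.g. 'sqi')."""
    code = code.strip().lower()
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a three-letter language code, got {code!r}")
    return code


def monodix_url(code: str, branch: str = "master") -> str:
    """Hypothetical helper: URL of the monolingual dictionary for one language."""
    code = _check_code(code)
    return f"{GITHUB_ORG}/apertium-{code}/blob/{branch}/apertium-{code}.{code}.dix"


def bidix_url(first: str, second: str, branch: str = "master") -> str:
    """Hypothetical helper: URL of the bilingual dictionary for a language pair."""
    pair = f"{_check_code(first)}-{_check_code(second)}"
    return f"{GITHUB_ORG}/apertium-{pair}/blob/{branch}/apertium-{pair}.{pair}.dix"


if __name__ == "__main__":
    print(monodix_url("sqi"))
    print(bidix_url("pol", "rus"))
```

The order of the two codes in a pair name is not derived by this sketch; it must be given exactly as the repository is named (the page links apertium-pol-rus, not apertium-rus-pol).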