Promotion HQ
Latest revision as of 15:35, 26 September 2016
Some ideas for expanding and promoting Apertium. Treat this page as a scratchpad.
Ideas for papers
- The use of lttoolbox to develop analysers for under-resourced languages (e.g. Welsh/Afrikaans ...)
- Retrieving bilingual dictionary entries using Wikipedia interwiki links
- On pragmatic dealing with MWEs
- On Spanish-French, Catalan-French
- On apertium-2/3 transfer
- The construction of a parallel Tagalog-Nenets dependency treebank via the pivot languages of Russian and English
Ideal pairs for development
These pairs are ideal for development because the languages in question are closely related or historically connected. Some are closer than others, but all are fairly close.
European Union official languages
- Danish <-> Swedish <-> Norwegian Bokmål <-> Norwegian Nynorsk <-> Icelandic <-> Faroese (North-Germanic dialect continuum)
- see North Germanic languages
- Slovenian <-> Serbo-Croatian <-> Macedonian <-> Bulgarian (South-Slavic dialect continuum)
- see Macedonian and Bulgarian
- see Serbo-Croatian and Macedonian
- Afrikaans <-> Dutch
- see Afrikaans and Dutch
- Irish <-> Scots Gaelic (Kevin Scannell already has a system, but it could be Apertiumised)
- see Scottish Gaelic and Irish
- Czech <-> Slovak
- Finnish <-> Estonian (Balto-Finnic, with agglutinative morphology)
- see Finnish and Estonian
- Romanian <-> Aromanian
- Romanian <-> Italian
- Italian <-> Neapolitan <-> Piedmontese <-> Friulian
- English <-> Scots/Ulster Scots (like Occitan, Scots might benefit from the standardisation effort described in Mikel's LREC paper); the SLC may have funds.
Non-EU
- Hindi <-> Urdu
- see Hindi and Urdu
- Punjabi <-> Hindi <-> Urdu
- Punjabi (East) <-> Punjabi (West)
- Persian <-> Tajik
- see Iranian Persian and Tajik
- North Sámi <-> Lule Sámi
- see North Sámi and Lule Sámi
- Northern Sotho <-> Sotho
- Turkish <-> Azerbaijani <-> Turkmen <-> Tatar (Southwestern-Turkic, Oghuz dialect continuum)
- see Turkic languages
- Uyghur <-> Uzbek
- Russian <-> Ukrainian <-> Belarusian (East-Slavic dialect continuum)
- Dungan <-> Mandarin (not that many people speak Dungan)
- Indonesian <-> Malaysian
- Xhosa <-> Zulu
- Ingush <-> Chechen
Large pairs for which we should have something
The languages in these pairs are not especially close, but they are important languages.
- Italian <-> French
- Dutch <-> German
- Italian <-> Spanish
- Romanian <-> French
Distribution including Apertium
- Debian: http://packages.debian.org/fr/wheezy/apertium
See also: Apertium on Ubuntu, Apertium on Mandriva, Apertium on Mac OS X, Apertium on Fedora, Apertium on Arch Linux, Apertium guide for Windows users, Apertium on Windows