Promotion HQ

From Apertium

Revision as of 01:48, 31 December 2011

A scratchpad of ideas for expanding and promoting Apertium.

Ideas for papers

  • The use of lttoolbox to develop analysers for under-resourced languages (e.g. Welsh/Afrikaans ...)
  • Retrieving bilingual dictionary entries using Wikipedia interwiki links.
  • On dealing pragmatically with multiword expressions (MWEs)
  • On Spanish-French, Catalan-French
  • On apertium-2/3 transfer

Ideal pairs for development

These pairs are ideal for development because the languages are closely related or historically connected. Some are closer than others, but all are reasonably close.

European Union official languages

  • Danish <-> Swedish <-> Norwegian Bokmål <-> Norwegian Nynorsk <-> Icelandic <-> Faroese (North-Germanic dialect continuum)
see North Germanic languages
  • Slovenian <-> Serbo-Croatian <-> Macedonian <-> Bulgarian (South-Slavic dialect continuum)
see Macedonian and Bulgarian
see Serbo-Croatian and Macedonian
  • Afrikaans <-> Dutch
see Afrikaans and Dutch
  • Irish <-> Scottish Gaelic (Kevin Scannell already has a system, but it could be Apertiumised)
see Scottish Gaelic and Irish
  • Czech <-> Slovak
  • Finnish <-> Estonian (Balto-Finnic, with agglutinative morphology)
  • Romanian <-> Aromanian
  • Romanian <-> Italian
  • Italian <-> Neapolitan <-> Piedmontese <-> Friulian
  • English <-> Scots/Ulster Scots (Scots, like Occitan, might benefit from the standardisation effort, as described in Mikel's LREC paper); the SLC may have funds.

Non-EU

  • Hindi <-> Urdu
see Hindi and Urdu
  • Punjabi <-> Hindi <-> Urdu
  • Punjabi (East) <-> Punjabi (West)
  • Persian <-> Tajik
see Iranian Persian and Tajik
  • North Sámi <-> Lule Sámi
see North Sámi and Lule Sámi
  • Northern Sotho <-> Sotho
  • Turkish <-> Azerbaijani <-> Turkmen <-> Tatar (Southwestern-Turkic, Oghuz dialect continuum)
see Turkic languages
  • Uyghur <-> Uzbek
  • Russian <-> Ukrainian <-> Belarusian (East-Slavic dialect continuum)
  • Dungan <-> Mandarin (not that many people speak Dungan)
  • Indonesian <-> Malaysian
  • Xhosa <-> Zulu
  • Ingush <-> Chechen

Large pairs for which we should have something

These languages are not especially close, but they are important enough that we should cover them.

  • Italian <-> French
  • Dutch <-> German
  • Italian <-> Spanish
see Español e italiano
  • Romanian <-> French

Distributions including Apertium

See also: Apertium on Ubuntu, Apertium on Mandriva, Apertium on Mac OS X, Apertium on Fedora, Apertium on Arch Linux, Apertium guide for Windows users, Apertium on Windows

See also