Promotion HQ

From Apertium

Revision as of 20:51, 29 September 2007

A scratchpad of ideas for expanding and promoting Apertium.

Ideas for papers

  • The use of lttoolbox to develop analysers for under-resourced languages (e.g. Welsh/Afrikaans ...)
  • Open-source Afrikaans-English machine translation
  • Longest-match left-to-right compound splitting in the context of Afrikaans-English machine translation.
  • Retrieving bilingual dictionary entries using Wikipedia interwiki links.
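
As a rough illustration of the compound-splitting idea listed above, here is a minimal sketch of greedy longest-match left-to-right splitting. It is not taken from lttoolbox or any Apertium code: the function name `split_compound`, the `min_len` parameter and the toy Afrikaans lexicon are all assumptions made for the example. A real splitter would also need to handle linking elements between parts and to backtrack when the greedy choice leads to a dead end.

```python
def split_compound(word, lexicon, min_len=2):
    """Greedy longest-match, left-to-right split of `word`.

    Returns a list of lexicon entries that concatenate to `word`,
    or None if the greedy strategy gets stuck (no backtracking).
    """
    parts = []
    pos = 0
    while pos < len(word):
        # Try the longest remaining candidate first, shrinking from the right.
        for end in range(len(word), pos + min_len - 1, -1):
            if word[pos:end] in lexicon:
                parts.append(word[pos:end])
                pos = end
                break
        else:
            # No lexicon entry starts at this position.
            return None
    return parts


# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {"reken", "rekenaar", "program", "woorde", "boek"}

if __name__ == "__main__":
    print(split_compound("rekenaarprogram", TOY_LEXICON))
    print(split_compound("woordeboek", TOY_LEXICON))
```

Because the longest candidate is tried first, "rekenaar" is preferred over the shorter entry "reken" at the start of "rekenaarprogram". A word that is itself in the lexicon comes back as a single part.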

Ideal pairs for development

These pairs are well suited to development because the languages are closely related or historically connected. Some pairs are closer than others, but all are close.

European Union official languages

  • Danish <-> Swedish <-> Norwegian Bokmål <-> Norwegian Nynorsk <-> Icelandic <-> Faroese (North-Germanic dialect continuum)
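      Between Nynorsk and Bokmål there is a proprietary implementation, Nynodata (http://www.nynodata.no/index.htm); some discussion is at http://nn.wikipedia.org/wiki/Brukardiskusjon:Trondtr
      Fran made a dictionary for Faroese: http://xixona.dlsi.ua.es/~fran/faroese/index.php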
  • Czech <-> Slovak
  • Slovenian <-> Serbo-Croatian <-> Macedonian <-> Bulgarian (South-Slavic dialect continuum)
  • Afrikaans <-> Dutch
  • Irish <-> Scots Gaelic — Kevin Scannell already has a system, but it could be Apertiumised.
  • Finnish <-> Estonian (Balto-Finnic, with agglutinative morphology)
  • Romanian <-> Aromanian
  • Italian <-> Neapolitan

Non-EU

  • Hindi <-> Urdu
  • Persian <-> Tajik
  • Northern Sotho <-> Sotho
  • Turkish <-> Azerbaijani <-> Turkmen <-> Tatar (Turkish, Azerbaijani and Turkmen form the Southwestern-Turkic, Oghuz dialect continuum; Tatar is Kipchak)
  • Uyghur <-> Uzbek
  • Russian <-> Ukrainian <-> Belarusian (East-Slavic dialect continuum)
  • Dungan <-> Mandarin (not that many people speak Dungan)
  • Indonesian <-> Malaysian

Large pairs for which we should have something

  • Italian <-> French
  • Dutch <-> German
  • Italian <-> Spanish
  • English <-> Spanish

See also