Difference between revisions of "Promotion HQ"

Revision as of 11:06, 3 February 2009

Ideas for papers

The use of lttoolbox to develop analysers for under-resourced languages (e.g. Welsh/Afrikaans ...)
~~Retrieving bilingual dictionary entries using Wikipedia interwiki links.~~
On pragmatic dealing with MWEs
On Spanish-French, Catalan-French
On apertium-2/3 transfer

Ideal pairs for development

These pairs are ideal for development because the languages are closely related or historically connected. Some are closer than others, but all are fairly close.

European Union official languages

Danish <-> Swedish <-> Norwegian Bokmål <-> Norwegian Nynorsk <-> Icelandic <-> Faroese (North-Germanic dialect continuum)

A proprietary implementation, Nynodata, exists for Nynorsk and Bokmål; some discussion here

Fran made a dictionary for Faroese: here (neither Icelandic nor Faroese is an official EU language)

Czech <-> Slovak
Slovenian <-> Serbo-Croatian <-> Macedonian <-> Bulgarian (South-Slavic dialect continuum)
Afrikaans <-> Dutch
Irish <-> Scots Gaelic — Kevin Scannell already has a system, but it could be Apertiumised.

see Scottish Gaelic and Irish

Finnish <-> Estonian (Balto-Finnic, with agglutinative morphology)
Romanian <-> Aromanian
Romanian <-> Italian
Italian <-> Neapolitan <-> Piedmontese <-> Friulian
English <-> Scots/Ulster Scots (like Occitan, Scots might benefit from the standardisation effort described in Mikel's LREC paper); the SLC may have funds.

Non-EU

Hindi <-> Urdu
Persian <-> Tajik
Northern Sotho <-> Sotho
Turkish <-> Azerbaijani <-> Turkmen <-> Tatar (Southwestern-Turkic, Oghuz dialect continuum)
Uyghur <-> Uzbek
Russian <-> Ukrainian <-> Belarusian (East-Slavic dialect continuum)
Dungan <-> Mandarin (not that many people speak Dungan)
Indonesian <-> Malaysian
Xhosa <-> Zulu
North Sámi <-> Lule Sámi

see North Sámi and Lule Sámi

Large pairs for which we should have something

These pairs are not particularly close, but they involve important languages.

Italian <-> French
Dutch <-> German
Italian <-> Spanish

@@ Line 5: / Line 5: @@
 * The use of lttoolbox to develop analysers for under-resourced languages (e.g. Welsh/Afrikaans ...)
+* <s>Retrieving bilingual dictionary entries using Wikipedia interwiki links.</s>
-* Open-source Afrikaans-English machine translation
-* Longest-match left-to-right compound splitting in the context of Afrikaans-English machine translation.
-* Retrieving bilingual dictionary entries using Wikipedia interwiki links.
 * On pragmatic dealing with MWEs
 * On Spanish-French, Catalan-French
@@ Line 23: / Line 21: @@
 * Slovenian <-> Serbo-Croatian <-> Macedonian <-> Bulgarian (South-Slavic dialect continuum)
 * Afrikaans <-> Dutch
 * Irish <-> Scots Gaelic &mdash; Kevin Scannell already has a system, but it could be Apertiumised.
-::See [[Scots Gaelic]] and the [[Incubator]]
+:: see [[Scottish Gaelic and Irish]]
 * Finnish <-> Estonian (Balto-Finnic, with [[agglutinative morphology]])
 * Romanian <-> Aromanian
@@ Line 42: / Line 40: @@
 * Indonesian <-> Malaysian
 * Xhosa <-> Zulu
+* North Sámi <-> Lule Sámi
+:: see [[North Sámi and Lule Sámi]]
 ==Large pairs for which we should have something==
