Difference between revisions of "North Germanic languages"
(Category:Documentation in English) |
|||
(7 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
[[Langues nord germaniques|En français]] |
|||
{{TOCD}}
||
The '''North Germanic languages''' include Danish (<code>dan</code>), Faroese (<code>fao</code>), Icelandic (<code>isl</code>), Norwegian (Nynorsk, <code>nno</code>, and Bokmål, <code>nob</code>) and Swedish (<code>swe</code>). The languages are related, with varying levels of mutual intelligibility. They would make a nice group for Apertium systems.
||
There are also some interesting variants which lack navies and armies: |
|||
* [https://en.wikipedia.org/wiki/Elfdalian Elfdalian] (<code>ovd</code>, [http://svn.code.sf.net/p/apertium/svn/incubator/apertium-ovd.ovd.dix tiny ovd.dix]) |
|||
* [https://en.wikipedia.org/wiki/Bornholm_dialect Bornholmsk] (<code>born1251</code> / <code>da-bornholm</code>) |
|||
==Status==
||
Line 6: | Line 13: | ||
Text in ''italic'' denotes an unreleased pair.
||
{| style="text-align: center;" class="wikitable" |
|||
{{North Germanic language translations}} |
|||
|- style="background: #ececec" |
|||
⚫ | |||
⚫ | |||
| '''dan''' || — || ''[[fao-dan]]'' || || [[dan-nor]] || — || — || [[dan-swe]] |
|||
⚫ | |||
| '''fao''' || ''[[fao-dan]]'' || — || ''[[fao-isl]]'' || ''[[fao-nor]]'' || — || — || |
|||
⚫ | |||
| '''isl''' || || ''[[fao-isl]]'' || — || || — || — || [[isl-swe]] |
|||
⚫ | |||
| '''nor''' || [[dan-nor]] || ''[[fao-nor]]'' || || — || — || — || [[swe-nor]] |
|||
⚫ | |||
| '''nob''' || — || — || — || — || — || [[nno-nob]] || — |
|||
|- |
|||
| '''nno''' || — || — || — || — || [[nno-nob]] || — || — |
|||
⚫ | |||
| '''swe''' || [[dan-swe]] || || [[isl-swe]] || [[swe-nor]] || — || — || — |
|||
⚫ | |||
⚫ | |||
==Existing==
||
Line 15: | Line 41: | ||
! Language !! File !! Paradigms !! Lemmata
||
|- |
|- |
||
| Norwegian Nynorsk || [https://github.com/apertium/apertium-nno/apertium-nno.nno.dix apertium-nno.nno.dix] || 1243 || {{#lst:apertium-nno/stats|stems}} |
||
|- |
|- |
||
| Norwegian Bokmål || [https://github.com/apertium/apertium-nob/apertium-nob.nob.dix apertium-nob.nob.dix] || 1335 || {{#lst:apertium-nob/stats|stems}} |
||
|- |
|- |
||
| Swedish || [https://github.com/apertium/apertium-swe/apertium-swe.swe.dix apertium-swe.swe.dix] || 1895 || {{#lst:apertium-swe/stats|stems}} |
||
|- |
|- |
||
| Danish || [https://github.com/apertium/apertium-dan/apertium-dan.dan.dix apertium-dan.dan.dix] || 713 || {{#lst:apertium-dan/stats|stems}} |
||
|- |
|- |
||
| Icelandic || [https://github.com/apertium/apertium-isl/apertium-isl.isl.dix apertium-isl.isl.dix] || 1,881 || {{#lst:apertium-isl/stats|stems}} |
||
|- |
|- |
||
| Faroese || [https://github.com/apertium/apertium-fao/apertium-fao.fao.dix apertium-fao.fao.dix] || 113 || {{#lst:apertium-fao/stats|stems}} |
||
|- |
|- |
||
|} |
|} |
||
A more comprehensive Faroese analyser is available from Giellatekno at https://github.com/giellalt/lang-fao.
|||
==Resources==
||
The resources listed below are useful for building machine translation systems for these languages.
||
;Monolingual |
|||
{|class=wikitable |
|||
⚫ | |||
⚫ | |||
| Norwegian || [http://www.edd.uio.no/prosjekt/ordbanken/ Norsk ordbank] || Large >100,000 lemma morphological dictionary of both Nynorsk and Bokmål, GPL. || [[Norsk ordbank]], [[Norwegian]] |
|||
⚫ | |||
| Norwegian || [http://maximos.aksis.uib.no/Aksis-wiki/Oslo-Bergen_Tagger Oslo-Bergen tagger] || Constraint grammar tagger for Norwegian, GPL. (converted for CG-3) || [[Norwegian]] |
|||
⚫ | |||
| Swedish || [http://w3.msi.vxu.se/~nivre/research/Talbanken05.html Talbanken] || A 300,000-word tree-bank: it is in XML, all words are nicely tagged with PAROLE-style tags. || |
|||
⚫ | |||
| Swedish || [http://spraakbanken.gu.se/sal/eng/ SALDO] || Swedish inflectional lexicon, LGPL || |
|||
⚫ | |||
| Danish || [http://www.isv.cbs.dk/~mbk/treebank/ Danish Dependency Treebank] || Danish tree bank, 100,000-word, XML, PAROLE tagged, under the GPL. || |
|||
⚫ | |||
| Danish || [http://wordnet.dk/dannet/menu?item=0&lang=1 DanNet] || Danish WordNet (~32,000 words), MIT licensed. |
|||
⚫ | |||
| Icelandic || || || [[Icelandic and English]] |
|||
|- |
|||
| Faroese || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/incubator/apertium-fo-is.fo.rlx apertium-fo-is.fo.rlx] || A [[constraint grammar]] for morphological disambiguation with ~120 rules || |
|||
|- |
|||
⚫ | |||
;Bilingual
||
Line 65: | Line 69: | ||
|Icelandic—Faroese || Apertium bidix with ~30 entries || ||
||
|- |
|- |
||
|Norwegian (Nynorsk)—Norwegian (Bokmål) || Apertium bidix with ~36,000 entries || || [[Norwegian]] |
|||
|- |
|||
|Swedish—Danish || Apertium bidix with ~2,000 entries || || [[Swedish and Danish]] |
|||
|} |
|} |
||
Latest revision as of 11:55, 9 November 2022
The North Germanic languages include Danish (dan), Faroese (fao), Icelandic (isl), Norwegian (Nynorsk, nno, and Bokmål, nob) and Swedish (swe). The languages are related, with varying levels of mutual intelligibility. They would make a nice group for Apertium systems.
There are also some interesting variants which lack navies and armies:
- Elfdalian (ovd, tiny ovd.dix)
- Bornholmsk (born1251 / da-bornholm)
Status
Text in italic denotes an unreleased pair.
| | dan | fao | isl | nor | nob | nno | swe |
|---|---|---|---|---|---|---|---|
| dan | — | *fao-dan* | | dan-nor | — | — | dan-swe |
| fao | *fao-dan* | — | *fao-isl* | *fao-nor* | — | — | |
| isl | | *fao-isl* | — | | — | — | isl-swe |
| nor | dan-nor | *fao-nor* | | — | — | — | swe-nor |
| nob | — | — | — | — | — | nno-nob | — |
| nno | — | — | — | — | nno-nob | — | — |
| swe | dan-swe | | isl-swe | swe-nor | — | — | — |
Existing
- Dictionaries
- See also: List of dictionaries
Language | File | Paradigms | Lemmata |
---|---|---|---|
Norwegian Nynorsk | apertium-nno.nno.dix | 1243 | 182,497 |
Norwegian Bokmål | apertium-nob.nob.dix | 1335 | 246,281 |
Swedish | apertium-swe.swe.dix | 1895 | 138,490 |
Danish | apertium-dan.dan.dix | 713 | 52,133 |
Icelandic | apertium-isl.isl.dix | 1,881 | 8,770 |
Faroese | apertium-fao.fao.dix | 113 | 2,318 |
A more comprehensive Faroese analyser is available from Giellatekno at https://github.com/giellalt/lang-fao.
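Counts like those in the Paradigms and Lemmata columns can be approximated directly from a monolingual dictionary. The following is a minimal sketch, not the method this wiki uses for its statistics (that method is not stated here). It assumes the usual Apertium `.dix` layout, with `<pardef>` elements under `<pardefs>` and `<e lm="...">` entries under `<section>`. The function name `dix_stats` and the tiny `SAMPLE` dictionary are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET


def dix_stats(xml_text: str) -> dict:
    """Count paradigm definitions and distinct lemmata in a monodix string."""
    root = ET.fromstring(xml_text)
    # Each <pardef> under <pardefs> is one paradigm.
    paradigms = len(root.findall("./pardefs/pardef"))
    # Entries carry the lemma in the lm attribute; count distinct values only.
    lemmata = {e.get("lm") for e in root.findall("./section/e") if e.get("lm")}
    return {"paradigms": paradigms, "lemmata": len(lemmata)}


# Hypothetical miniature dictionary, not taken from any real language package.
SAMPLE = """<dictionary>
  <alphabet>abc</alphabet>
  <sdefs><sdef n="n"/></sdefs>
  <pardefs>
    <pardef n="hus__n"><e><p><l/><r><s n="n"/></r></p></e></pardef>
    <pardef n="bil__n"><e><p><l/><r><s n="n"/></r></p></e></pardef>
  </pardefs>
  <section id="main" type="standard">
    <e lm="hus"><i>hus</i><par n="hus__n"/></e>
    <e lm="bil"><i>bil</i><par n="bil__n"/></e>
    <e lm="bil"><i>bil</i><par n="hus__n"/></e>
  </section>
</dictionary>"""

print(dix_stats(SAMPLE))  # {'paradigms': 2, 'lemmata': 2}
```

For a dictionary on disk, the root can be obtained with `ET.parse(path).getroot()` instead of `ET.fromstring`. Counting distinct `lm` values rather than raw entries is a design choice; a count of all `<e>` elements would come out higher wherever a lemma has several entries.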
Resources
The resources listed below are useful for building machine translation systems for these languages.
- Bilingual
Language pair | Resource | Description | See also |
---|---|---|---|
Icelandic—Danish | Apertium bidix with ~960 entries | | |
Icelandic—Faroese | Apertium bidix with ~30 entries | | |
Funding possibilities
Samples
Language | Text |
---|---|
Danish | Alle mennesker er født frie og lige i værdighed og rettigheder. De er udstyret med fornuft og samvittighed, og de bør handle mod hverandre i en broderskabets ånd. |
Norwegian (Bokmål) | Alle mennesker er født frie og med samme menneskeverd og menneskerettigheter. De er utstyrt med fornuft og samvittighet og bør handle mot hverandre i brorskapets ånd. |
Norwegian (Nynorsk) | Alle menneske er fødde til fridom og med same menneskeverd og menneskerettar. Dei har fått fornuft og samvit og skal leve med kvarandre som brør. |
Swedish | Alla människor är födda fria och lika i värde och rättigheter. De har utrustats med förnuft och samvete och bör handla gentemot varandra i en anda av gemenskap. |
Faroese | Øll menniskju eru fødd fræls og jøvn til virðingar og mannarættindi. Tey hava skil og samvitsku og eiga at fara hvørt um annað í bróðuranda. |
Icelandic | Hver maður er borinn frjáls og jafn öðrum að virðingu og réttindum. Menn eru gæddir vitsmunum og samvizku, og ber þeim að breyta bróðurlega hverjum við annan. |