Difference between revisions of "North Germanic languages"

From Apertium
 
(29 intermediate revisions by 7 users not shown)
Line 1: Line 1:
[[Langues nord germaniques|En français]]

{{TOCD}}
{{TOCD}}
The '''North Germanic languages''' include Danish (<code>da</code>), Faroese (<code>fo</code>), Icelandic (<code>is</code>), Norwegian (Nynorsk, <code>nn</code> and Bokmål, <code>nb</code>) and Swedish (<code>sv</code>). The languages are related with varying levels of mutual intelligibility. This group would make a nice group for Apertium systems.
The '''North Germanic languages''' include Danish (<code>dan</code>), Faroese (<code>fao</code>), Icelandic (<code>isl</code>), Norwegian (Nynorsk, <code>nno</code>, and Bokmål, <code>nob</code>) and Swedish (<code>swe</code>). The languages are related, with varying levels of mutual intelligibility. Together they would make a good group for Apertium systems.

There are also some interesting variants which lack navies and armies:
* [https://en.wikipedia.org/wiki/Elfdalian Elfdalian] (<code>ovd</code>, [http://svn.code.sf.net/p/apertium/svn/incubator/apertium-ovd.ovd.dix tiny ovd.dix])
* [https://en.wikipedia.org/wiki/Bornholm_dialect Bornholmsk] (<code>born1251</code> / <code>da-bornholm</code>)


==Status==

Text in ''italic'' denotes an unreleased pair.

{| style="text-align: center;" class="wikitable"
|- style="background: #ececec"
! !! dan !! fao !! isl !! nor !! nob !! nno !! swe
|-
| '''dan''' || &mdash; || ''[[fao-dan]]'' || || [[dan-nor]] || &mdash; || &mdash; || [[dan-swe]]
|-
| '''fao''' || ''[[fao-dan]]'' || &mdash; || ''[[fao-isl]]'' || ''[[fao-nor]]'' || &mdash; || &mdash; ||
|-
| '''isl''' || || ''[[fao-isl]]'' || &mdash; || || &mdash; || &mdash; || [[isl-swe]]
|-
| '''nor''' || [[dan-nor]] || ''[[fao-nor]]'' || || &mdash; || &mdash; || &mdash; || [[swe-nor]]
|-
| '''nob''' || &mdash; || &mdash; || &mdash; || &mdash; || &mdash; || [[nno-nob]] || &mdash;
|-
| '''nno''' || &mdash; || &mdash; || &mdash; || &mdash; || [[nno-nob]] || &mdash; || &mdash;
|-
| '''swe''' || [[dan-swe]] || || [[isl-swe]] || [[swe-nor]] || &mdash; || &mdash; || &mdash;
|-
|}



==Existing==
==Existing==
Line 9: Line 41:
! Language !! File !! Paradigms !! Lemmata
! Language !! File !! Paradigms !! Lemmata
|-
|-
| Norwegian Nynorsk || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-nn-nb/apertium-nn-nb.nn.dix apertium-nn-nb.nn.dix] || 770 || 83,584
| Norwegian Nynorsk || [https://github.com/apertium/apertium-nno/apertium-nno.nno.dix apertium-nno.nno.dix] || 1243 || {{#lst:apertium-nno/stats|stems}}
|-
|-
| Norwegian Bokmål || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-nn-nb/apertium-nn-nb.nb.dix apertium-nn-nb.nb.dix] || 705 || 119,567
| Norwegian Bokmål || [https://github.com/apertium/apertium-nob/apertium-nob.nob.dix apertium-nob.nob.dix] || 1335 || {{#lst:apertium-nob/stats|stems}}
|-
|-
| Swedish || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-sv-da/apertium-sv-da.sv.dix apertium-sv-da.sv.dix] || 227 || 8,729
| Swedish || [https://github.com/apertium/apertium-swe/apertium-swe.swe.dix apertium-swe.swe.dix] || 1895 || {{#lst:apertium-swe/stats|stems}}
|-
|-
| Danish || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-sv-da/apertium-sv-da.da.dix apertium-sv-da.da.dix] || 409 || 1,359
| Danish || [https://github.com/apertium/apertium-dan/apertium-dan.dan.dix apertium-dan.dan.dix] || 713 || {{#lst:apertium-dan/stats|stems}}
|-
|-
| Faroese || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/incubator/apertium-fo-is.fo.dix apertium-fo-is.fo.dix] || 113 || 1,864
| Icelandic || [https://github.com/apertium/apertium-isl/apertium-isl.isl.dix apertium-isl.isl.dix] || 1,881 || {{#lst:apertium-isl/stats|stems}}
|-
|-
| Icelandic || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/incubator/apertium-fo-is.is.dix apertium-fo-is.is.dix] || 158 || 938
| Faroese || [https://github.com/apertium/apertium-fao/apertium-fao.fao.dix apertium-fao.fao.dix] || 113 || {{#lst:apertium-fao/stats|stems}}
|-
|-
|}
|}
(though see https://github.com/giellalt/lang-fao for a more comprehensive Faroese analyser from Giellatekno)


==Resources==
==Resources==
Line 27: Line 60:
Resources listed below will be useful in building machine translation systems for these languages.
Resources listed below will be useful in building machine translation systems for these languages.


;Bilingual
;Monolingual


{|class=wikitable
{|class=wikitable
! Language !! Resource !! Description !! See also
! Language pair !! Resource !! Description !! See also
|-
|-
|Icelandic&mdash;Danish || Apertium bidix with ~960 entries || ||
| Norwegian || [http://www.edd.uio.no/prosjekt/ordbanken/ Norsk ordbank] || Large >100,000 lemma morphological dictionary of both Nynorsk and Bokmål, GPL. || [[Norsk ordbank]], [[Norwegian]]
|-
|-
|Icelandic&mdash;Faroese || Apertium bidix with ~30 entries || ||
| Swedish || [http://w3.msi.vxu.se/~nivre/research/Talbanken05.html Talbanken] || A 300,000-word tree-bank: it is in XML, all words are nicely tagged with PAROLE-style tags. ||
|-
| Danish || [http://www.isv.cbs.dk/~mbk/treebank/ Danish Dependency Treebank] || Danish tree bank, 100,000-word, XML, PAROLE tagged, under the GPL. ||
|-
| Icelandic || || ||
|-
| Faroese || [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/incubator/apertium-fo-is.fo.rle apertium-fo-is.fo.rle] || A [[constraint grammar]] for morphological disambiguation with ~200 rules ||
|-
|-
|}
|}

;Bilingual


==Funding possibilities==
==Funding possibilities==
Line 70: Line 95:


[[Category:Languages]]
[[Category:Languages]]
[[Category:North Germanic languages]]
[[Category:Documentation in English]]

Latest revision as of 11:55, 9 November 2022

En français

The North Germanic languages include Danish (dan), Faroese (fao), Icelandic (isl), Norwegian (Nynorsk, nno, and Bokmål, nob) and Swedish (swe). The languages are related, with varying levels of mutual intelligibility. Together they would make a good group for Apertium systems.
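Older revisions of this page, and older package names such as apertium-sv-da, use two-letter codes (da, fo, is, nn, nb, sv) where the current revision uses three-letter ones. The following is a minimal sketch of that correspondence as a lookup table. The helper name `to_three_letter` is hypothetical and is not part of any Apertium tool.

```python
# Two-letter codes from the older revision of this page, mapped to the
# three-letter codes used in the current revision.
TWO_TO_THREE = {
    "da": "dan",  # Danish
    "fo": "fao",  # Faroese
    "is": "isl",  # Icelandic
    "nn": "nno",  # Norwegian Nynorsk
    "nb": "nob",  # Norwegian Bokmål
    "sv": "swe",  # Swedish
}


def to_three_letter(pair: str) -> str:
    """Convert a hyphenated pair name such as 'sv-da' to 'swe-dan'.

    Hypothetical helper for illustration only. Codes that are not in the
    table (including codes that are already three letters) are left as-is.
    """
    return "-".join(TWO_TO_THREE.get(code, code) for code in pair.split("-"))


if __name__ == "__main__":
    print(to_three_letter("nn-nb"))  # prints "nno-nob"
```

Leaving unknown codes untouched means the function can be applied safely to names that have already been converted.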

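There are also some interesting variants which lack navies and armies:
* [https://en.wikipedia.org/wiki/Elfdalian Elfdalian] (<code>ovd</code>, [http://svn.code.sf.net/p/apertium/svn/incubator/apertium-ovd.ovd.dix tiny ovd.dix])
* [https://en.wikipedia.org/wiki/Bornholm_dialect Bornholmsk] (<code>born1251</code> / <code>da-bornholm</code>)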


Status

Text in italic denotes an unreleased pair.

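{| style="text-align: center;" class="wikitable"
|- style="background: #ececec"
! !! dan !! fao !! isl !! nor !! nob !! nno !! swe
|-
| '''dan''' || &mdash; || ''[[fao-dan]]'' || || [[dan-nor]] || &mdash; || &mdash; || [[dan-swe]]
|-
| '''fao''' || ''[[fao-dan]]'' || &mdash; || ''[[fao-isl]]'' || ''[[fao-nor]]'' || &mdash; || &mdash; ||
|-
| '''isl''' || || ''[[fao-isl]]'' || &mdash; || || &mdash; || &mdash; || [[isl-swe]]
|-
| '''nor''' || [[dan-nor]] || ''[[fao-nor]]'' || || &mdash; || &mdash; || &mdash; || [[swe-nor]]
|-
| '''nob''' || &mdash; || &mdash; || &mdash; || &mdash; || &mdash; || [[nno-nob]] || &mdash;
|-
| '''nno''' || &mdash; || &mdash; || &mdash; || &mdash; || [[nno-nob]] || &mdash; || &mdash;
|-
| '''swe''' || [[dan-swe]] || || [[isl-swe]] || [[swe-nor]] || &mdash; || &mdash; || &mdash;
|}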
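The status table can also be read as a small data structure. The sketch below lists each pair once, with the released or unreleased marking taken from the italic markup in the table's wikitext. The names `PAIR_STATUS` and `pairs_for` are hypothetical and exist only for this illustration.

```python
from __future__ import annotations

# Each pair from the Status table, listed once.
# True = released, False = unreleased (shown in italic in the table).
PAIR_STATUS = {
    "fao-dan": False,
    "fao-isl": False,
    "fao-nor": False,
    "dan-nor": True,
    "dan-swe": True,
    "isl-swe": True,
    "swe-nor": True,
    "nno-nob": True,
}


def pairs_for(language: str, released_only: bool = False) -> list[str]:
    """Return the sorted pairs that involve the given three-letter code.

    Hypothetical helper: with released_only=True, unreleased pairs are
    filtered out.
    """
    return sorted(
        pair
        for pair, released in PAIR_STATUS.items()
        if language in pair.split("-") and (released or not released_only)
    )


if __name__ == "__main__":
    print(pairs_for("fao"))                      # all Faroese pairs
    print(pairs_for("dan", released_only=True))  # released Danish pairs
```

Storing each pair once avoids the duplication in the symmetric table, where every pair appears in both its row and its column.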


Existing

Dictionaries
See also: List of dictionaries
{| class="wikitable"
! Language !! File !! Paradigms !! Lemmata
|-
| Norwegian Nynorsk || apertium-nno.nno.dix || 1,243 || 182,497
|-
| Norwegian Bokmål || apertium-nob.nob.dix || 1,335 || 246,281
|-
| Swedish || apertium-swe.swe.dix || 1,895 || 138,490
|-
| Danish || apertium-dan.dan.dix || 713 || 52,133
|-
| Icelandic || apertium-isl.isl.dix || 1,881 || 8,770
|-
| Faroese || apertium-fao.fao.dix || 113 || 2,318
|}

(though see https://github.com/giellalt/lang-fao for a more comprehensive Faroese analyser from Giellatekno)

Resources

Resources listed below will be useful in building machine translation systems for these languages.

Bilingual
{| class="wikitable"
! Language pair !! Resource !! Description !! See also
|-
| Icelandic&mdash;Danish || Apertium bidix with ~960 entries || ||
|-
| Icelandic&mdash;Faroese || Apertium bidix with ~30 entries || ||
|}

Funding possibilities

Samples

{| class="wikitable"
! Language !! Text
|-
| Danish || Alle mennesker er født frie og lige i værdighed og rettigheder. De er udstyret med fornuft og samvittighed, og de bør handle mod hverandre i en broderskabets ånd.
|-
| Norwegian (Bokmål) || Alle mennesker er født frie og med samme menneskeverd og menneskerettigheter. De er utstyrt med fornuft og samvittighet og bør handle mot hverandre i brorskapets ånd.
|-
| Norwegian (Nynorsk) || Alle menneske er fødde til fridom og med same menneskeverd og menneskerettar. Dei har fått fornuft og samvit og skal leve med kvarandre som brør.
|-
| Swedish || Alla människor är födda fria och lika i värde och rättigheter. De har utrustats med förnuft och samvete och bör handla gentemot varandra i en anda av gemenskap.
|-
| Faroese || Øll menniskju eru fødd fræls og jøvn til virðingar og mannarættindi. Tey hava skil og samvitsku og eiga at fara hvørt um annað í bróðuranda.
|-
| Icelandic || Hver maður er borinn frjáls og jafn öðrum að virðingu og réttindum. Menn eru gæddir vitsmunum og samvizku, og ber þeim að breyta bróðurlega hverjum við annan.
|}