North Germanic languages
Latest revision as of 11:55, 9 November 2022
The North Germanic languages include Danish (dan), Faroese (fao), Icelandic (isl), Norwegian (Nynorsk, nno, and Bokmål, nob) and Swedish (swe). The languages are closely related, with varying levels of mutual intelligibility, which makes them a natural group for Apertium systems.
There are also some interesting variants which lack navies and armies:
- Elfdalian (ovd, tiny ovd.dix)
- Bornholmsk (born1251/da-bornholm)
Status
Text in italic denotes an unreleased pair.
|  | dan | fao | isl | nor | nob | nno | swe |
|---|---|---|---|---|---|---|---|
| dan | — | *fao-dan* |  | dan-nor | — | — | dan-swe |
| fao | *fao-dan* | — | *fao-isl* | *fao-nor* | — | — |  |
| isl |  | *fao-isl* | — |  | — | — | isl-swe |
| nor | dan-nor | *fao-nor* |  | — | — | — | swe-nor |
| nob | — | — | — | — | — | nno-nob | — |
| nno | — | — | — | — | nno-nob | — | — |
| swe | dan-swe |  | isl-swe | swe-nor | — | — | — |
Existing
- Dictionaries
- See also: List of dictionaries
| Language | File | Paradigms | Lemmata | 
|---|---|---|---|
| Norwegian Nynorsk | apertium-nno.nno.dix | 1,243 | 182,497 |
| Norwegian Bokmål | apertium-nob.nob.dix | 1,335 | 246,281 |
| Swedish | apertium-swe.swe.dix | 1,895 | 138,490 |
| Danish | apertium-dan.dan.dix | 713 | 52,133 | 
| Icelandic | apertium-isl.isl.dix | 1,881 | 8,770 | 
| Faroese | apertium-fao.fao.dix | 113 | 2,318 | 
For Faroese, see also https://github.com/giellalt/lang-fao, a more comprehensive analyser from Giellatekno.
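The paradigm and lemma columns in the dictionary table can be approximated by counting elements in a `.dix` file. The following is a minimal sketch, assuming the usual Apertium monolingual dictionary layout (`<pardef>` elements under `<pardefs>`, and `<e lm="...">` entries under `<section>`). The sample dictionary and the `count_dix` helper are hypothetical, and the figures on the wiki's stats pages may be computed differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature dictionary in the Apertium .dix XML layout.
SAMPLE_DIX = """<dictionary>
  <pardefs>
    <pardef n="hus__n">
      <e><p><l></l><r><s n="n"/><s n="sg"/></r></p></e>
    </pardef>
    <pardef n="bil__n">
      <e><p><l></l><r><s n="n"/><s n="sg"/></r></p></e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <e lm="hus"><i>hus</i><par n="hus__n"/></e>
    <e lm="bil"><i>bil</i><par n="bil__n"/></e>
    <e lm="bil"><i>bil</i><par n="bil__n"/></e>
  </section>
</dictionary>"""


def count_dix(xml_text):
    """Return (number of paradigms, number of distinct lemmata)."""
    root = ET.fromstring(xml_text)
    # Each <pardef> under <pardefs> defines one paradigm.
    paradigms = len(root.findall("./pardefs/pardef"))
    # Distinct lm attributes on section entries approximate the lemma count;
    # entries without an lm attribute are skipped.
    lemmata = {e.get("lm") for e in root.findall("./section/e") if e.get("lm")}
    return paradigms, len(lemmata)


print(count_dix(SAMPLE_DIX))
```

For a real dictionary, the same function can be fed the contents of a file such as `apertium-fao.fao.dix` read from a local checkout.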
Resources
The resources listed below are useful for building machine translation systems for these languages.
- Bilingual
| Language pair | Resource | Description | See also | 
|---|---|---|---|
| Icelandic—Danish | Apertium bidix with ~960 entries |  |  |
| Icelandic—Faroese | Apertium bidix with ~30 entries |  |  |
Funding possibilities
Samples
| Language | Text | 
|---|---|
| Danish | Alle mennesker er født frie og lige i værdighed og rettigheder. De er udstyret med fornuft og samvittighed, og de bør handle mod hverandre i en broderskabets ånd. | 
| Norwegian (Bokmål) | Alle mennesker er født frie og med samme menneskeverd og menneskerettigheter. De er utstyrt med fornuft og samvittighet og bør handle mot hverandre i brorskapets ånd. | 
| Norwegian (Nynorsk) | Alle menneske er fødde til fridom og med same menneskeverd og menneskerettar. Dei har fått fornuft og samvit og skal leve med kvarandre som brør. | 
| Swedish | Alla människor är födda fria och lika i värde och rättigheter. De har utrustats med förnuft och samvete och bör handla gentemot varandra i en anda av gemenskap. | 
| Faroese | Øll menniskju eru fødd fræls og jøvn til virðingar og mannarættindi. Tey hava skil og samvitsku og eiga at fara hvørt um annað í bróðuranda. | 
| Icelandic | Hver maður er borinn frjáls og jafn öðrum að virðingu og réttindum. Menn eru gæddir vitsmunum og samvizku, og ber þeim að breyta bróðurlega hverjum við annan. | 
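As a rough illustration of how the sample texts differ, the sketch below computes the Jaccard overlap of word types between three of the samples above. This is only a crude lexical comparison of a single sentence pair, not a measure of mutual intelligibility. The `tokens` and `jaccard` helpers are hypothetical names introduced for this example.

```python
import re

# Sample texts copied from the table above.
SAMPLES = {
    "dan": "Alle mennesker er født frie og lige i værdighed og rettigheder. "
           "De er udstyret med fornuft og samvittighed, og de bør handle mod "
           "hverandre i en broderskabets ånd.",
    "nob": "Alle mennesker er født frie og med samme menneskeverd og "
           "menneskerettigheter. De er utstyrt med fornuft og samvittighet "
           "og bør handle mot hverandre i brorskapets ånd.",
    "isl": "Hver maður er borinn frjáls og jafn öðrum að virðingu og "
           "réttindum. Menn eru gæddir vitsmunum og samvizku, og ber þeim að "
           "breyta bróðurlega hverjum við annan.",
}


def tokens(text):
    """Lower-case the text and return its set of word types."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the word-type sets of two texts."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


for x, y in [("dan", "nob"), ("dan", "isl"), ("nob", "isl")]:
    print(f"{x}-{y}: {jaccard(SAMPLES[x], SAMPLES[y]):.2f}")
```

On these samples the Danish and Bokmål texts share far more word types with each other than either shares with the Icelandic text.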

