Difference between revisions of "User:Sushain/GermanicLanguages"
(30 intermediate revisions by one other user not shown) | |||
Line 1: | Line 1: | ||
{{TOCD}} |
{{TOCD}} |
||
The '''Germanic languages''' ([http://www.ethnologue.com/subgroups/germanic gem]) constitute a branch of the Indo-European language family spoken primarily in Europe, Anglo-America and Australasia. The common ancestor of all the languages is called Proto-Germanic, which was spoken approximately in the mid-1st millennium BC in Iron Age northern Europe. Of the over 50 different Germanic languages, the most widely spoken are [[English]], [[German]], and [[Dutch]] with over |
The '''Germanic languages''' ([http://www.ethnologue.com/subgroups/germanic gem]) constitute a branch of the Indo-European language family spoken primarily in Europe, Anglo-America and Australasia. The common ancestor of all the languages is called Proto-Germanic, which was spoken approximately in the mid-1st millennium BC in Iron Age northern Europe. Of the over 50 different Germanic languages, the most widely spoken are [[English]], [[German]], and [[Dutch]] with over 450 million speakers in total. |
||
The master plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair. The current status of these goals is listed below. |
The master plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair. The current status of these goals is listed below. |
||
Line 10: | Line 10: | ||
Once a transducer has ~80% coverage on a range of medium-large corpora we can say it is "working". Over 90% and it can be considered to be "production". |
Once a transducer has ~80% coverage on a range of medium-large corpora we can say it is "working". Over 90% and it can be considered to be "production". |
||
{| class="wikitable sortable" |
|||
===Germanic languages by subgroup=== |
|||
|- |
|||
!rowspan=2| name |
|||
* [[West Germanic]] |
|||
!rowspan=2| language |
|||
** [[High German]] |
|||
!rowspan=2| native name |
|||
*** Upper German |
|||
!colspan=2 class="unsortable"| ISO 639 |
|||
*** [[Yiddish]] |
|||
!rowspan=2| formalism |
|||
*** Central German |
|||
!rowspan=2| state |
|||
**** East Central German |
|||
!rowspan=2| stems |
|||
**** West Central German |
|||
!rowspan=2| paradigms |
|||
***** [[Luxembourgish]] |
|||
!rowspan=2| coverage |
|||
***** [[Pennsylvania German language]] |
|||
!rowspan=2| location |
|||
** Low German |
|||
!rowspan=2 class="unsortable"| primary authors |
|||
** Low Franconian |
|||
|-class="sortbottom" |
|||
*** [[Dutch]] |
|||
! -2 |
|||
*** [[Afrikaans]] |
|||
! -3 |
|||
** [[Anglo-Frisian languages]] |
|||
|- |
|||
*** [[Frisian languages]] |
|||
| <code>[[apertium-nno]]</code> |
|||
*** [[English languages]] |
|||
|| [[Nynorsk]] |
|||
|| nynorsk |
|||
**** [[Scots]] |
|||
||<code>nn</code> |
|||
**** Yola (extinct) |
|||
|| <code>nno</code> |
|||
* [[North Germanic]] |
|||
|| [[lttoolbox]] |
|||
** West Scandinavian |
|||
|| production |
|||
*** [[Norwegian]] |
|||
|align="right"| {{#lst:Apertium-nno/stats|stems}} |
|||
*** [[Icelandic]] |
|||
|align="right"| {{#lst:Apertium-nno/stats|paradigms}} |
|||
*** [[Faroese]] |
|||
|align="center"| |
|||
*** Greenlandic Norse (extinct) |
|||
|| [[apertium-nno]] ([[languages]]) |
|||
*** Norn (extinct) |
|||
|| [[User:Francis_Tyers|Fran]], [[User:Trondtr|Trondtr]], [[User:Unhammer|Unhammer]] |
|||
** East Scandinavian |
|||
|- |
|||
*** [[Danish]] |
|||
| <code>[[apertium-nob]]</code> |
|||
*** [[Swedish]] |
|||
|| [[Bokmål]] |
|||
|| bokmål |
|||
||<code>nb</code> |
|||
|| <code>nob</code> |
|||
|| [[lttoolbox]] |
|||
|| production |
|||
|align="right"| {{#lst:Apertium-nob/stats|stems}} |
|||
|align="right"| {{#lst:Apertium-nob/stats|paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-nob]] ([[languages]]) |
|||
|| [[User:Francis_Tyers|Fran]], [[User:Trondtr|Trondtr]], [[User:Unhammer|Unhammer]] |
|||
|- |
|||
| <code>[[apertium-dan]]</code> |
|||
|| [[Danish]] |
|||
|| dansk |
|||
||<code>da</code> |
|||
|| <code>dan</code> |
|||
|| [[lttoolbox]] |
|||
|| production |
|||
|align="right"| {{#lst:Apertium-dan/stats|stems}} |
|||
|align="right"| {{#lst:Apertium-dan/stats|paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-dan]] ([[languages]]) |
|||
|| [[User:Francis_Tyers|Fran]], [[User:Jacob Nordfalk|JacobEo]], Jonas |
|||
|- |
|||
| <code>[[apertium-eng]]</code> |
|||
|| [[English]] |
|||
|| English |
|||
||<code>en</code> |
|||
|| <code>eng</code> |
|||
|| [[lttoolbox]] |
|||
|| production |
|||
|align="right"| {{#lst:Apertium-is-en/stats|en-stems}} |
|||
|align="right"| {{#lst:Apertium-is-en/stats|en-paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-is-en]] ([[trunk]]) - [[Apertium-eng/stats#English_dix.27s_in_trunk|?]] |
|||
|| [[User:Francis_Tyers|Fran]], marthab08, hrafn65, hloftsson, olafurw |
|||
|- |
|||
| <code>[[apertium-nld]]</code> |
|||
|| [[Dutch]] |
|||
|| Nederlands |
|||
||<code>nl</code> |
|||
|| <code>nld</code> |
|||
|| [[lttoolbox]] |
|||
|| production |
|||
|align="right"| {{#lst:Apertium-nld/stats|stems}} |
|||
|align="right"| {{#lst:Apertium-nld/stats|paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-nld]] ([[languages]]) |
|||
|| [[User:Francis_Tyers|Fran]], Teirlynck, Otte, Naudé |
|||
|- |
|||
| <code>[[apertium-afr]]</code> |
|||
|| [[Afrikaans]] |
|||
|| Afrikaans |
|||
||<code>af</code> |
|||
|| <code>afr</code> |
|||
|| [[lttoolbox]] |
|||
|| production |
|||
|align="right"| {{#lst:Apertium-en-af/stats|af-stems}} |
|||
|align="right"| {{#lst:Apertium-en-af/stats|af-paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-en-af]] ([[staging]]) |
|||
|| [[User:Francis_Tyers|Fran]], winterstream |
|||
|- |
|||
| <code>[[apertium-deu]]</code> |
|||
|| [[German]] |
|||
|| Deutsch |
|||
||<code>de</code> |
|||
|| <code>deu</code> |
|||
|| [[lttoolbox]] |
|||
|| working |
|||
|align="right"| {{#lst:Apertium-deu/stats|stems}} |
|||
|align="right"| {{#lst:Apertium-deu/stats|paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-deu]] ([[incubator]]) |
|||
|| [[User:Francis_Tyers|Fran]], ebenimeli, Jim Regan |
|||
|- |
|||
| <code>[[apertium-swe]]</code> |
|||
|| [[Swedish]] |
|||
|| svenska |
|||
||<code>sv</code> |
|||
|| <code>swe</code> |
|||
|| [[lttoolbox]] |
|||
|| working |
|||
|align="right"| {{#lst:Apertium-swe/stats|stems}} |
|||
|align="right"| {{#lst:Apertium-swe/stats|paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-swe]] ([[languages]]) |
|||
|| ? |
|||
|- |
|||
| <code>[[apertium-isl]]</code> |
|||
|| [[Icelandic]] |
|||
|| íslenska |
|||
||<code>is</code> |
|||
|| <code>isl</code> |
|||
|| [[lttoolbox]] |
|||
|| development |
|||
|align="right"| {{#lst:Apertium-isl/stats|stems}} |
|||
|align="right"| {{#lst:Apertium-isl/stats|paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-isl]] ([[languages]]) |
|||
|| [[User:Francis_Tyers|Fran]], Loftsson, Brandt, Sigurþórsson |
|||
|- |
|||
| <code>[[apertium-sco]]</code> |
|||
|| [[Scots]] |
|||
|| Scots |
|||
||<code>-</code> |
|||
|| <code>sco</code> |
|||
|| [[lttoolbox]] |
|||
|| development |
|||
|align="right"| {{#lst:Apertium-eng-sco/stats|sco-stems}} |
|||
|align="right"| {{#lst:Apertium-eng-sco/stats|sco-paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-eng-sco]] ([[incubator]]) |
|||
|| Jim Regan |
|||
|- |
|||
| <code>[[apertium-fao]]</code> |
|||
|| [[Faroese]] |
|||
|| føroyskt |
|||
||<code>fo</code> |
|||
|| <code>fao</code> |
|||
|| [[lttoolbox]] |
|||
|| development |
|||
|align="right"| {{#lst:Apertium-fao/stats|stems}} |
|||
|align="right"| {{#lst:Apertium-fao/stats|paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-fao]] ([[languages]]) |
|||
|| [[User:Trondtr|Trondtr]] |
|||
|- |
|||
| <code>[[apertium-fry]]</code> |
|||
|| [[West Frisian]] |
|||
|| Frysk |
|||
||<code>fy</code> |
|||
|| <code>fry</code> |
|||
|| [[lttoolbox]] |
|||
|| prototype |
|||
|align="right"| {{#lst:Apertium-nld-fry/stats|fry-stems}} |
|||
|align="right"| {{#lst:Apertium-nld-fry/stats|fry-paradigms}} |
|||
|align="center"| |
|||
|| [[apertium-nld-fry]] ([[nursery]]) |
|||
|| [[User:Francis_Tyers|Fran]] |
|||
|} |
|||
===Pairs=== |
===Pairs=== |
||
Line 51: | Line 192: | ||
* [[Icelandic]] and [[Faroese]] |
* [[Icelandic]] and [[Faroese]] |
||
* [[Swedish]], [[Danish]], [[ Norwegian (Bokmål)]], [[Norwegian (Nynorsk)]] |
* [[Swedish]], [[Danish]], [[ Norwegian (Bokmål)]], [[Norwegian (Nynorsk)]] |
||
* [[English]] and [[Scots]] |
|||
====Table of Existing Pairs==== |
====Table of Existing Pairs==== |
||
Text in ''italics'' denotes language pairs in the incubator. Regular text denotes a developing language pair in |
Text in ''italics'' denotes language pairs in the incubator. Regular text denotes a developing language pair in nursery, while text in '''bold''' denotes a stable well-working language pair in trunk and text in '''''bold and italics''''' denotes a pair in staging. Bidix stems as counted with [[dixcounter]] are displayed below. |
||
{| style="text-align: center;" class="wikitable" |
{| style="text-align: center;" class="wikitable" |
||
|- style="background: #ececec" |
|- style="background: #ececec" |
||
! !! dan !! nor !! swe !! fao !! isl !! deu !! nld !! afr !! fry !! eng |
! !! dan !! nor !! swe !! fao !! isl !! deu !! nld !! afr !! fry !! eng !! nob !! nno !! sco |
||
|- |
|- |
||
| '''dan''' || - || '''[[Apertium-dan-nor|dan-nor]]'''<br>''' |
| '''dan''' || - || '''[[Apertium-dan-nor|dan-nor]]'''<br>'''�_stems�''' || '''[[Apertium-sv-da|sv-da]]'''<br>'''�_stems�''' || ''[[Apertium-da-fo|da-fo]]''<br>�_stems� || ''[[Apertium-isl-dan|isl-dan]]''<br>�_stems� || || || || || ''[[Apertium-da-en|da-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| '''nor''' || '''[[Apertium-dan-nor|dan-nor]]'''<br>''' |
| '''nor''' || '''[[Apertium-dan-nor|dan-nor]]'''<br>'''�_stems�''' || - || ''[[Apertium-swe-nor|swe-nor]]''<br>�_stems� || || || || || || || [[Apertium-nor-eng|nor-eng]]<br>�_stems� || || || |
||
|- |
|- |
||
| '''swe''' || '''[[Apertium-sv-da|sv-da]]'''<br>''' |
| '''swe''' || '''[[Apertium-sv-da|sv-da]]'''<br>'''�_stems�''' || ''[[Apertium-swe-nor|swe-nor]]''<br>�_stems� || - || || '''[[Apertium-is-sv|is-sv]]'''<br>'''�_stems�''' || ''[[Apertium-deu-swe|deu-swe]]''<br>�_stems� || || || || || ''[[Apertium-sv-nb|sv-nb]]''<br>�_stems� || || |
||
|- |
|- |
||
| '''fao''' || ''[[Apertium-da-fo|da-fo]]''<br> |
| '''fao''' || ''[[Apertium-da-fo|da-fo]]''<br>�_stems� || || || - || [[Apertium-fo-is|fo-is]]<br>�_stems� || || || || || || ''[[Apertium-fo-nb|fo-nb]]''<br>�_stems� || || |
||
|- |
|- |
||
| '''isl''' || ''[[Apertium-isl-dan|isl-dan]]''<br> |
| '''isl''' || ''[[Apertium-isl-dan|isl-dan]]''<br>�_stems� || || '''[[Apertium-is-sv|is-sv]]'''<br>'''�_stems�''' || [[Apertium-fo-is|fo-is]]<br>�_stems� || - || || || || || '''[[Apertium-is-en|is-en]]'''<br>'''�_stems�''' || || || |
||
|| || || || '''[[Apertium-is-en|is-en]]'''<br>'''8,699''' |
|||
|- |
|- |
||
| '''deu''' || || || ''[[Apertium-deu-swe|deu-swe]]''<br> |
| '''deu''' || || || ''[[Apertium-deu-swe|deu-swe]]''<br>�_stems� || || || - || ''[[Apertium-de-nl|de-nl]]''<br>�_stems� || || || ''[[Apertium-en-de|en-de]]''<br>�_stems� || || || |
||
|- |
|- |
||
| '''nld''' || || || || || || ''[[Apertium-de-nl|de-nl]]''<br> |
| '''nld''' || || || || || || ''[[Apertium-de-nl|de-nl]]''<br>�_stems� || - || '''[[Apertium-af-nl|af-nl]]'''<br>'''�_stems�''' || [[Apertium-nld-fry|nld-fry]]<br>�_stems� || ''[[Apertium-en-nl|en-nl]]''<br>�_stems� || || || |
||
|- |
|- |
||
| '''afr''' || || || || || || || '''[[Apertium-af-nl|af-nl]]'''<br>''' |
| '''afr''' || || || || || || || '''[[Apertium-af-nl|af-nl]]'''<br>'''�_stems�''' || - || || [[Apertium-en-af|en-af]]<br>�_stems� || || || |
||
|- |
|- |
||
| '''fry''' || || || || || || || [[Apertium-nld-fry|nld-fry]]<br> |
| '''fry''' || || || || || || || [[Apertium-nld-fry|nld-fry]]<br>�_stems� || || - || || || || |
||
|- |
|- |
||
| '''eng''' || ''[[Apertium-da-en|da-en]]''<br> |
| '''eng''' || ''[[Apertium-da-en|da-en]]''<br>�_stems� || [[Apertium-nor-eng|nor-eng]]<br>�_stems� || || || '''[[Apertium-is-en|is-en]]'''<br>'''�_stems�''' || ''[[Apertium-en-de|en-de]]''<br>�_stems� || ''[[Apertium-en-nl|en-nl]]''<br>�_stems� || [[Apertium-en-af|en-af]]<br>�_stems� || || - || || || ''[[Apertium-eng-sco|eng-sco]]''<br>�_stems� |
||
|- |
|- |
||
| || || || || || || || || || || |
| '''nob''' || || || ''[[Apertium-sv-nb|sv-nb]]''<br>�_stems� || ''[[Apertium-fo-nb|fo-nb]]''<br>�_stems� || || || || || || || - || '''[[Apertium-nn-nb|nn-nb]]'''<br>'''�_stems�''' || |
||
|- |
|- |
||
| ''' |
| '''nno''' || || || || || || || || || || || '''[[Apertium-nn-nb|nn-nb]]'''<br>'''�_stems�''' || - || |
||
|- |
|- |
||
| ''' |
| '''sco''' || || || || || || || || || || ''[[Apertium-eng-sco|eng-sco]]''<br>�_stems� || || || - |
||
|- |
|- |
||
| |
| || || || || || || || || || || || || || |
||
|- |
|- |
||
| ''' |
| '''ben''' || || || || || || || || || || ''[[Apertium-bn-en|bn-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''bul''' || || || || || || || || || || [[Apertium-bg-en|bg-en]]<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''cat''' || || || || || || || || || || '''[[Apertium-en-ca|en-ca]]'''<br>'''�_stems�''' || || || |
||
|- |
|- |
||
| ''' |
| '''cym''' || || || || || || || || || || '''[[Apertium-cy-en|cy-en]]'''<br>'''�_stems�''' || || || |
||
|- |
|- |
||
| ''' |
| '''ell''' || || || || || || || || || || ''[[Apertium-ell-eng|ell-eng]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''epo''' || || || ''[[Apertium-eo-sv|eo-sv]]''<br>�_stems� || || || ''[[Apertium-eo-de|eo-de]]''<br>�_stems� || ''[[Apertium-eo-nl|eo-nl]]''<br>�_stems� || || || '''[[Apertium-eo-en|eo-en]]'''<br>'''�_stems�''' || || || |
||
|- |
|- |
||
| ''' |
| '''eus''' || || || || || || || || || || '''[[Apertium-eu-en|eu-en]]'''<br>'''�_stems�''' || || || |
||
|- |
|- |
||
| ''' |
| '''fin''' || || || || || || || || || || [[Apertium-fin-eng|fin-eng]]<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''fra''' || || || || || || || ''[[Apertium-fr-nl|fr-nl]]''<br>�_stems� || || || ''[[Apertium-en-fr|en-fr]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''gle''' || || || || || || || || || || ''[[Apertium-en-ga|en-ga]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''glg''' || || || || || || || || || || '''[[Apertium-en-gl|en-gl]]'''<br>'''�_stems�''' || || || |
||
|- |
|- |
||
| ''' |
| '''hat''' || || || || || || || || || || ''[[Apertium-ht-en|ht-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''hbs''' || || || || || || || || || || ''[[Apertium-sh-en|sh-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''hin''' || || || || || || || || || || [[Apertium-eng-hin|eng-hin]]<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''hun''' || || || || || || || || || || ''[[Apertium-hun-eng|hun-eng]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''hye''' || || || || || || || || || || [[Apertium-hye-eng|hye-eng]]<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''ita''' || || || || || || || || || || ''[[Apertium-en-it|en-it]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''kaz''' || || || || || || || || || || ''[[Apertium-eng-kaz|eng-kaz]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''kir''' || || || || || || || || || || ''[[Apertium-ky-en|ky-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''lat''' || || || || || || || || || || ''[[Apertium-la-en|la-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''lav''' || || || || || || || || || || ''[[Apertium-en-lv|en-lv]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''lit''' || || || || || || || || || || ''[[Apertium-en-lt|en-lt]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''ltz''' || || || || || || ''[[Apertium-deu-ltz|deu-ltz]]''<br>�_stems� || || || || || || || |
||
|- |
|- |
||
| ''' |
| '''lvs''' || || || || || || || || || || ''[[Apertium-eng-lvs|eng-lvs]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''mal''' || || || || || || || || || || ''[[Apertium-mal-eng|mal-eng]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''mar''' || || || || || || || || || || ''[[Apertium-mar-eng|mar-eng]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''mfe''' || || || || || || || || || || ''[[Apertium-mfe-en|mfe-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''mkd''' || || || || || || || || || || '''[[Apertium-mk-en|mk-en]]'''<br>'''�_stems�''' || || || |
||
|- |
|- |
||
| ''' |
| '''mlt''' || || || || || || || || || || ''[[Apertium-en-mt|en-mt]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''nep''' || || || || || || || || || || ''[[Apertium-ne-en|ne-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''pol''' || || || || || || || || || || ''[[Apertium-en-pl|en-pl]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''por''' || || || || || || || || || || [[Apertium-en-pt|en-pt]]<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''rus''' || || || || || || || || || || ''[[Apertium-ru-en|ru-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''sin''' || || || || || || || || || || ''[[Apertium-si-en|si-en]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''sme''' || || || || || || ''[[Apertium-sme-deu|sme-deu]]''<br>�_stems� || || || || || '''[[Apertium-sme-nob|sme-nob]]'''<br>'''�_stems�''' || || |
||
|- |
|- |
||
| ''' |
| '''spa''' || || || || || || ''[[Apertium-es-de|es-de]]''<br>�_stems� || || || || '''[[Apertium-en-es|en-es]]'''<br>'''�_stems�''' || || || |
||
|- |
|- |
||
| ''' |
| '''sqi''' || || || || || || || || || || ''[[Apertium-en-sq|en-sq]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''tel''' || || || || || || || || || || ''[[Apertium-eng-tel|eng-tel]]''<br>�_stems� || || || |
||
|- |
|- |
||
| ''' |
| '''tur''' || || || || || || || || || || ''[[Apertium-tr-en|tr-en]]''<br>�_stems� || || || |
||
|} |
|||
{|class=wikitable |
|||
! !!de!!l!!nd!!nl!!af!!fy!!is!!fo!!sv!!da!!no |
|||
|- |
|||
|'''de'''|| - ||de-l|| || || || || || || || || |
|||
|- |
|||
|'''l'''||l-de|| - || || || || || || || || || |
|||
|- |
|||
|'''nd'''|| || || - ||nd-nl||nd-af||nd-fy|| || || || || |
|||
|- |
|||
|'''nl'''|| || ||nl-nd|| - ||nl-af||nl-fy|| || || || || |
|||
|- |
|||
|'''af'''|| || ||af-nd||af-nl|| - ||af-fy|| || || || || |
|||
|- |
|||
|'''fy'''|| || ||fy-nd||fy-nl||fy-af|| - || || || || || |
|||
|- |
|||
|'''is'''|| || || || || || || - ||is-fo|| || || |
|||
|- |
|||
|'''fo'''|| || || || || || ||fo-is|| - || || || |
|||
|- |
|||
|'''sv'''|| || || || || || || || || - ||sv-da||sv-no |
|||
|- |
|||
|'''da'''|| || || || || || || || ||da-sv|| - ||da-no |
|||
|- |
|||
|'''no'''|| || || || || || || || ||no-sv||no-da|| - |
|||
|- |
|- |
||
| '''vie''' || || || || || || || || || || ''[[Apertium-vi-en|vi-en]]''<br>�_stems� || || || |
|||
|} |
|} |
||
==Classification== |
|||
All living Germanic languages belong either to the West Germanic or to the North Germanic branch: |
|||
* [[West Germanic]] |
|||
** [[High German]]: [[German]], [[Yiddish]], [[Luxembourgish]] |
|||
** Low German: West Low German, East Low German |
|||
** Low Franconian: [[Dutch]], [[Afrikaans]] |
|||
** [[Anglo-Frisian languages]]: |
|||
*** English group: [[English]], [[Scots]] |
|||
*** Frisian group: [[West Frisian]], Saterland Frisian, North Frisian |
|||
* [[North Germanic]] |
|||
** West Scandinavian: [[Norwegian]], [[Icelandic]], [[Faroese]] |
|||
** East Scandinavian: [[Danish]], [[Swedish]] |
|||
==Samples== |
==Samples== |
||
Line 215: | Line 346: | ||
|| Luxembourgeois || All Mënsch kënnt fräi a mat deer selwechter Dignitéit an dene selwechte Rechter op d'Welt. Jiddereen huet säi Verstand a säi Gewësse krut an soll an engem Geescht vu Bridderlechkeet denen anere géintiwwer handelen. |
|| Luxembourgeois || All Mënsch kënnt fräi a mat deer selwechter Dignitéit an dene selwechte Rechter op d'Welt. Jiddereen huet säi Verstand a säi Gewësse krut an soll an engem Geescht vu Bridderlechkeet denen anere géintiwwer handelen. |
||
|- |
|- |
||
|| Yiddish, Eastern |
|||
|| Yiddish, Eastern || יעדער מענטש װערט געבױרן פֿרײַ און גלײַך אין כּבֿוד און רעכט. יעדער װערט באַשאָנקן מיט פֿאַרשטאַנד און געװיסן; יעדער זאָל זיך פֿירן מיט אַ צװײטן אין אַ געמיט פֿון ברודערשאַפֿט. |
|||
|align="right"| יעדער מענטש װערט געבױרן פֿרײַ און גלײַך אין כּבֿוד און רעכט. יעדער װערט באַשאָנקן מיט פֿאַרשטאַנד און געװיסן; יעדער זאָל זיך פֿירן מיט אַ צװײטן אין אַ געמיט פֿון ברודערשאַפֿט. |
|||
|- |
|- |
||
|| Afrikaans || Alle menslike wesens word vry, met gelyke waardigheid en regte, gebore. Hulle het rede en gewete en behoort in die gees van broederskap teenoor mekaar op te tree. |
|| Afrikaans || Alle menslike wesens word vry, met gelyke waardigheid en regte, gebore. Hulle het rede en gewete en behoort in die gees van broederskap teenoor mekaar op te tree. |
||
Line 222: | Line 354: | ||
|- |
|- |
||
|| Saxon, Low || All de Minschen sünd frie un gliek an Wüürd un Rechten baren. Se hebbt Vernunft un een Geweten un se schüllt sik Bröder sien. |
|| Saxon, Low || All de Minschen sünd frie un gliek an Wüürd un Rechten baren. Se hebbt Vernunft un een Geweten un se schüllt sik Bröder sien. |
||
|- |
|||
|| Norwegian, Nynorsk || Alle menneske er fødde til fridom og med same menneskeverd og menneskerettar. Dei har fått fornuft og samvit og skal leve med kvarandre som brør. |
|||
|- |
|||
|| Norwegian, Bokmaal || Alle mennesker er født frie og med samme menneskeverd og menneskerettigheter. De er utstyrt med fornuft og samvittighet og bør handle mot hverandre i brorskapets ånd. |
|||
|} |
|} |
||
Latest revision as of 07:04, 14 December 2014
The Germanic languages (gem) constitute a branch of the Indo-European language family spoken primarily in Europe, Anglo-America and Australasia. The common ancestor of all the languages is called Proto-Germanic, which was spoken approximately in the mid-1st millennium BC in Iron Age northern Europe. Of the more than 50 Germanic languages, the most widely spoken are English, German, and Dutch, with over 450 million speakers in total.
The master plan involves generating independent finite-state transducers for each language, and then making individual dictionaries and transfer rules for every pair. The current status of these goals is listed below.
Status
The ultimate goal is to have multi-purpose transducers for a variety of Germanic languages. These can then be paired for X→Y translation by adding a constraint grammar (CG) for language X and transfer rules and a dictionary for the pair X→Y. Development progress for each language's transducers and dictionary pairs is listed below.
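To give a feel for how the per-pair work scales, the following minimal sketch enumerates the pairs that could be built from the twelve transducers listed on this page. It assumes, for illustration only, that one bilingual dictionary is shared by both directions of a pair while transfer rules are written per direction; the names `TRANSDUCERS`, `unordered_pairs` and `directions` are hypothetical and not part of any Apertium tool.

```python
from itertools import combinations, permutations

# ISO 639-3 codes of the transducers listed in the Transducers table.
TRANSDUCERS = [
    "nno", "nob", "dan", "eng", "nld", "afr",
    "deu", "swe", "isl", "sco", "fao", "fry",
]


def unordered_pairs(codes):
    """Every language pair once (assumed: one bilingual dictionary each)."""
    return list(combinations(sorted(codes), 2))


def directions(codes):
    """Every translation direction X -> Y (assumed: one rule set each)."""
    return list(permutations(sorted(codes), 2))


if __name__ == "__main__":
    print(len(unordered_pairs(TRANSDUCERS)), "possible pairs")
    print(len(directions(TRANSDUCERS)), "possible translation directions")
```

Because each transducer is independent, adding one more language only requires new pair-specific resources, not changes to the existing transducers.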
Transducers
Once a transducer reaches roughly 80% coverage on a range of medium-to-large corpora, we can say it is "working". Above 90%, it can be considered "production".
name | language | native name | ISO 639-1 | ISO 639-3 | formalism | state | stems | paradigms | coverage | location | primary authors |
---|---|---|---|---|---|---|---|---|---|---|---|
apertium-nno | Nynorsk | nynorsk | nn | nno | lttoolbox | production | 182,497 | 1,192 | | apertium-nno (languages) | Fran, Trondtr, Unhammer |
apertium-nob | Bokmål | bokmål | nb | nob | lttoolbox | production | 246,281 | 1,194 | | apertium-nob (languages) | Fran, Trondtr, Unhammer |
apertium-dan | Danish | dansk | da | dan | lttoolbox | production | 52,133 | 626 | | apertium-dan (languages) | Fran, JacobEo, Jonas |
apertium-eng | English | English | en | eng | lttoolbox | production | | | | apertium-is-en (trunk) - ? | Fran, marthab08, hrafn65, hloftsson, olafurw |
apertium-nld | Dutch | Nederlands | nl | nld | lttoolbox | production | 25,079 | 1,095 | | apertium-nld (languages) | Fran, Teirlynck, Otte, Naudé |
apertium-afr | Afrikaans | Afrikaans | af | afr | lttoolbox | production | | | | apertium-en-af (staging) | Fran, winterstream |
apertium-deu | German | Deutsch | de | deu | lttoolbox | working | 74,339 | 1,427 | | apertium-deu (incubator) | Fran, ebenimeli, Jim Regan |
apertium-swe | Swedish | svenska | sv | swe | lttoolbox | working | 138,490 | 1,834 | | apertium-swe (languages) | ? |
apertium-isl | Icelandic | íslenska | is | isl | lttoolbox | development | 8,770 | 1,878 | | apertium-isl (languages) | Fran, Loftsson, Brandt, Sigurþórsson |
apertium-sco | Scots | Scots | - | sco | lttoolbox | development | | | | apertium-eng-sco (incubator) | Jim Regan |
apertium-fao | Faroese | føroyskt | fo | fao | lttoolbox | development | 2,318 | 278 | | apertium-fao (languages) | Trondtr |
apertium-fry | West Frisian | Frysk | fy | fry | lttoolbox | prototype | | | | apertium-nld-fry (nursery) | Fran |
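The coverage thresholds given above the table can be sketched as a small helper. This is only an illustration: a plain set of known surface forms stands in for a compiled transducer, and the function names `naive_coverage` and `classify_state` are hypothetical. The page does not define thresholds for the "development" and "prototype" states, so anything under 80% is simply reported as below "working".

```python
def naive_coverage(tokens, known_forms):
    """Fraction of corpus tokens that the (stand-in) analyser recognises.

    `known_forms` is an assumed set of lower-cased surface forms; a real
    measurement would run the corpus through the compiled transducer.
    """
    if not tokens:
        return 0.0
    recognised = sum(1 for token in tokens if token.lower() in known_forms)
    return recognised / len(tokens)


def classify_state(coverage):
    """Map a coverage ratio (0.0 to 1.0) to the states named on this page."""
    if coverage > 0.90:
        return "production"
    if coverage >= 0.80:
        return "working"
    # Thresholds for "development" and "prototype" are not given in the text.
    return "below working"


if __name__ == "__main__":
    corpus = ["The", "cat", "xyzzy", "sat"]
    lexicon = {"the", "cat", "sat"}
    ratio = naive_coverage(corpus, lexicon)
    print(f"{ratio:.0%} coverage: {classify_state(ratio)}")
```

Since the page asks for coverage on a range of corpora, a real check would apply this to several corpora rather than a single one.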
Pairs
Some Germanic languages that are particularly similar to one another (and hence have high levels of mutual intelligibility) include those in the following list:
- German and Luxembourgish
- Low German, Dutch, Afrikaans and West Frisian
- Icelandic and Faroese
- Swedish, Danish, Norwegian (Bokmål), Norwegian (Nynorsk)
- English and Scots
Table of Existing Pairs
Text in italics denotes language pairs in the incubator. Regular text denotes a developing language pair in nursery, text in bold denotes a stable, well-working language pair in trunk, and text in bold and italics denotes a pair in staging. Bidix stems as counted with dixcounter are displayed below.
language | dan | nor | swe | fao | isl | deu | nld | afr | fry | eng | nob | nno | sco |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
dan | - | '''dan-nor''' | '''sv-da''' | ''da-fo'' | ''isl-dan'' | | | | | ''da-en'' | | | |
nor | '''dan-nor''' | - | ''swe-nor'' | | | | | | | nor-eng | | | |
swe | '''sv-da''' | ''swe-nor'' | - | | '''is-sv''' | ''deu-swe'' | | | | | ''sv-nb'' | | |
fao | ''da-fo'' | | | - | fo-is | | | | | | ''fo-nb'' | | |
isl | ''isl-dan'' | | '''is-sv''' | fo-is | - | | | | | '''is-en''' | | | |
deu | | | ''deu-swe'' | | | - | ''de-nl'' | | | ''en-de'' | | | |
nld | | | | | | ''de-nl'' | - | '''af-nl''' | nld-fry | ''en-nl'' | | | |
afr | | | | | | | '''af-nl''' | - | | en-af | | | |
fry | | | | | | | nld-fry | | - | | | | |
eng | ''da-en'' | nor-eng | | | '''is-en''' | ''en-de'' | ''en-nl'' | en-af | | - | | | ''eng-sco'' |
nob | | | ''sv-nb'' | ''fo-nb'' | | | | | | | - | '''nn-nb''' | |
nno | | | | | | | | | | | '''nn-nb''' | - | |
sco | | | | | | | | | | ''eng-sco'' | | | - |
ben | | | | | | | | | | ''bn-en'' | | | |
bul | | | | | | | | | | bg-en | | | |
cat | | | | | | | | | | '''en-ca''' | | | |
cym | | | | | | | | | | '''cy-en''' | | | |
ell | | | | | | | | | | ''ell-eng'' | | | |
epo | | | ''eo-sv'' | | | ''eo-de'' | ''eo-nl'' | | | '''eo-en''' | | | |
eus | | | | | | | | | | '''eu-en''' | | | |
fin | | | | | | | | | | fin-eng | | | |
fra | | | | | | | ''fr-nl'' | | | ''en-fr'' | | | |
gle | | | | | | | | | | ''en-ga'' | | | |
glg | | | | | | | | | | '''en-gl''' | | | |
hat | | | | | | | | | | ''ht-en'' | | | |
hbs | | | | | | | | | | ''sh-en'' | | | |
hin | | | | | | | | | | eng-hin | | | |
hun | | | | | | | | | | ''hun-eng'' | | | |
hye | | | | | | | | | | hye-eng | | | |
ita | | | | | | | | | | ''en-it'' | | | |
kaz | | | | | | | | | | ''eng-kaz'' | | | |
kir | | | | | | | | | | ''ky-en'' | | | |
lat | | | | | | | | | | ''la-en'' | | | |
lav | | | | | | | | | | ''en-lv'' | | | |
lit | | | | | | | | | | ''en-lt'' | | | |
ltz | | | | | | ''deu-ltz'' | | | | | | | |
lvs | | | | | | | | | | ''eng-lvs'' | | | |
mal | | | | | | | | | | ''mal-eng'' | | | |
mar | | | | | | | | | | ''mar-eng'' | | | |
mfe | | | | | | | | | | ''mfe-en'' | | | |
mkd | | | | | | | | | | '''mk-en''' | | | |
mlt | | | | | | | | | | ''en-mt'' | | | |
nep | | | | | | | | | | ''ne-en'' | | | |
pol | | | | | | | | | | ''en-pl'' | | | |
por | | | | | | | | | | en-pt | | | |
rus | | | | | | | | | | ''ru-en'' | | | |
sin | | | | | | | | | | ''si-en'' | | | |
sme | | | | | | ''sme-deu'' | | | | | '''sme-nob''' | | |
spa | | | | | | ''es-de'' | | | | '''en-es''' | | | |
sqi | | | | | | | | | | ''en-sq'' | | | |
tel | | | | | | | | | | ''eng-tel'' | | | |
tur | | | | | | | | | | ''tr-en'' | | | |
vie | | | | | | | | | | ''vi-en'' | | | |
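The legend for this table maps text styling to a repository location. In the page's wikitext that styling is written with MediaWiki quote markup (`''` for italics, `'''` for bold, `'''''` for bold italics), so the mapping can be sketched as below. The helper `pair_location` is hypothetical and only inspects the leading quote markup of a cell; it is not an Apertium or MediaWiki function.

```python
def pair_location(cell):
    """Guess where a language pair lives from the styling of its table cell.

    Follows the legend: italics -> incubator, regular -> nursery,
    bold -> trunk, bold and italics -> staging.
    """
    text = cell.strip()
    # Check the longest markup first: five quotes also start with three and two.
    if text.startswith("'''''"):
        return "staging"
    if text.startswith("'''"):
        return "trunk"
    if text.startswith("''"):
        return "incubator"
    return "nursery"


if __name__ == "__main__":
    for cell in (
        "'''[[Apertium-sv-da|sv-da]]'''",
        "''[[Apertium-da-fo|da-fo]]''",
        "[[Apertium-fo-is|fo-is]]",
    ):
        print(cell, "->", pair_location(cell))
```

Checking the five-quote case before the shorter ones matters, because a bold-italic cell would otherwise be classified as plain bold.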
Classification
All living Germanic languages belong either to the West Germanic or to the North Germanic branch:
- West Germanic
  - High German: German, Yiddish, Luxembourgish
  - Low German: West Low German, East Low German
  - Low Franconian: Dutch, Afrikaans
  - Anglo-Frisian languages:
    - English group: English, Scots
    - Frisian group: West Frisian, Saterland Frisian, North Frisian
- North Germanic
  - West Scandinavian: Norwegian, Icelandic, Faroese
  - East Scandinavian: Danish, Swedish
Samples
Article 1 of the Universal Declaration of Human Rights:
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
Language | Text |
---|---|
Danish | Alle mennesker er født frie og lige i værdighed og rettigheder. De er udstyret med fornuft og samvittighed, og de bør handle mod hverandre i en broderskabets ånd. |
Swedish | Alla människor äro födda fria och lika i värde och rättigheter. De äro utrustade med förnuft och samvete och böra handla gentemot varandra i en anda av broderskap. |
Faroese | Øll menniskju eru fødd fræls og jøvn til virðingar og mannarættindi. Tey hava skil og samvitsku og eiga at fara hvørt um annað í bróðuranda. |
Icelandic | Hver maður er borinn frjáls og jafn öðrum að virðingu og réttindum. Menn eru gæddir vitsmunum og samvizku, og ber þeim að breyta bróðurlega hverjum við annan. |
English | All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood. |
Scots | Aw human sowels is born free and equal in dignity and richts. They are tochered wi mense and conscience and shuld guide theirsels ane til ither in a speirit o britherheid. |
Luxembourgeois | All Mënsch kënnt fräi a mat deer selwechter Dignitéit an dene selwechte Rechter op d'Welt. Jiddereen huet säi Verstand a säi Gewësse krut an soll an engem Geescht vu Bridderlechkeet denen anere géintiwwer handelen. |
Yiddish, Eastern | יעדער מענטש װערט געבױרן פֿרײַ און גלײַך אין כּבֿוד און רעכט. יעדער װערט באַשאָנקן מיט פֿאַרשטאַנד און געװיסן; יעדער זאָל זיך פֿירן מיט אַ צװײטן אין אַ געמיט פֿון ברודערשאַפֿט. |
Afrikaans | Alle menslike wesens word vry, met gelyke waardigheid en regte, gebore. Hulle het rede en gewete en behoort in die gees van broederskap teenoor mekaar op te tree. |
Dutch | Alle mensen worden vrij en gelijk in waardigheid en rechten geboren. Zij zijn begiftigd met verstand en geweten, en behoren zich jegens elkander in een geest van broederschap te gedragen. |
Saxon, Low | All de Minschen sünd frie un gliek an Wüürd un Rechten baren. Se hebbt Vernunft un een Geweten un se schüllt sik Bröder sien. |
Norwegian, Nynorsk | Alle menneske er fødde til fridom og med same menneskeverd og menneskerettar. Dei har fått fornuft og samvit og skal leve med kvarandre som brør. |
Norwegian, Bokmaal | Alle mennesker er født frie og med samme menneskeverd og menneskerettigheter. De er utstyrt med fornuft og samvittighet og bør handle mot hverandre i brorskapets ånd. |
Vulnerability
This table summarizes the vulnerability of various Germanic languages. Data is derived from the ‘Atlas of the World’s Languages in Danger, © UNESCO, http://www.unesco.org/culture/languages-atlas’ and Ethnologue.
Language | ISO 639-3 | Location | Speakers | Status (Ethnologue) | Status (UNESCO) |
---|---|---|---|---|---|
Frankish | frk | Germany | 0 | 10 (Extinct) | - |
Wymysorys | wym | Poland | 70 | 8b (Nearly extinct) | 3 (Severely endangered) |
Frisian, Eastern | frs | Germany | 5,120 | 8a (Moribund) | - |
Saterfriesisch | stq | Germany | 1,000 | 7 (Shifting) | 3 (Severely endangered) |
German, Colonia Tovar | gct | Venezuela | 1,500 | 7 (Shifting) | - |
Yiddish, Western | yih | Germany | 5,400 | 7 (Shifting) | - |
Frisian, Northern | frr | Germany | 10,000 | 7 (Shifting) | 3 (Severely endangered) |
Hunsrik | hrx | Brazil | 3,000,000 | 7 (Shifting) | - |
Walser | wae | Switzerland | 22,780 | 6b (Threatened) | - |
Vlaams | vls | Belgium | 1,204,000 | 6b (Threatened) | - |
Mócheno | mhn | Italy | 1,900 | 6a (Vigorous) | 2 (Definitely endangered) |
Cimbrian | cim | Italy | 2,230 | 6a (Vigorous) | 2 (Definitely endangered) |
Silesian, Lower | sli | Poland | 22,900 | 6a (Vigorous) | - |
Hutterisch | geh | Canada | 40,000 | 6a (Vigorous) | - |
Kölsch | ksh | Germany | 250,000 | 6a (Vigorous) | - |
Jutish | jut | Germany, Denmark | ? | 6a (Vigorous) | 2 (Definitely endangered) |
Pfaelzisch | pfl | Germany, France | ? | 6a (Vigorous) | 1 (Vulnerable) |
Saxon, Low | nds | Germany | 1,000 | 5 (Developing) | - |
German, Pennsylvania | pdc | United States | 133,000 | 5 (Developing) | - |
Zeeuws | zea | Netherlands | 220,000 | 5 (Developing) | - |
Plautdietsch | pdt | Canada & Ukraine | 394,900 | 5 (Developing) | 2 (Definitely endangered) |
Gronings | gos | Netherlands | 592,000 | 5 (Developing) | - |
Swabian | swg | Germany | 819,000 | 5 (Developing) | - |
Saxon, Upper | sxu | Germany | 2,000,000 | 5 (Developing) | - |
Mainfränkisch | vmf | Germany, Czech Republic | 4,910,000 | 5 (Developing) | 1 (Vulnerable) |
German, Swiss | gsw | Switzerland | 6,469,000 | 5 (Developing) | - |
Bavarian | bar | Germany, Austria, Hungary, Italy, Switzerland, Czech Republic | 13,259,000 | 5 (Developing) | 1 (Vulnerable) |
Achterhoeks | act | Netherlands | ? | 5 (Developing) | - |
Drents | drt | Netherlands | ? | 5 (Developing) | - |
Sallands | sdz | Netherlands | ? | 5 (Developing) | - |
Stellingwerfs | stl | Netherlands | ? | 5 (Developing) | - |
Twents | twd | Netherlands | ? | 5 (Developing) | - |
Veluws | vel | Netherlands | ? | 5 (Developing) | - |
Westphalien | wep | Germany | ? | 5 (Developing) | - |
Scots | sco | United Kingdom of Great Britain and Northern Ireland | 100,000 | 4 (Educational) | 1 (Vulnerable) |
Luxembourgeois | ltz | Germany, Belgium, France, Luxembourg | 320,710 | 4 (Educational) | 1 (Vulnerable) |
Yiddish, Eastern | ydd | Israel & Germany, Austria, Belarus, Belgium, Denmark, Estonia, Finland, France, Hungary, Italy, Latvia, Lithuania, Luxembourg, Republic of Moldova, Norway, Netherlands, Poland, Romania, United Kingdom of Great Britain and Northern Ireland, Russian Federation, Slovakia, Sweden, Switzerland, Czech Republic, Ukraine | 1,505,030 | 4 (Educational) | 2 (Definitely endangered) |
Faroese | fao | Denmark & Faroe Islands | 66,150 | 2 (Provincial) | 1 (Vulnerable) |
Frisian, Western | fry | Netherlands | 467,000 | 2 (Provincial) | 1 (Vulnerable) |
Limburgish | lim | Netherlands | 1,300,000 | 2 (Provincial) | - |
Icelandic | isl | Iceland | 243,840 | 1 (National) | - |
Norwegian | nor | Norway | 4,741,780 | 1 (National) | - |
Afrikaans | afr | South Africa | 4,949,410 | 1 (National) | - |
Danish | dan | Denmark | 5,592,490 | 1 (National) | - |
Swedish | swe | Sweden | 8,381,829 | 1 (National) | 2 (Definitely endangered) |
Dutch | nld | Netherlands | 22,984,690 | 1 (National) | - |
German, Standard | deu | Germany | 83,812,810 | 1 (National) | - |
English | eng | United Kingdom | 334,800,758 | 1 (National) | - |
This article uses material from the Wikipedia article "Germanic languages", which is released under the Creative Commons Attribution-Share-Alike License 3.0.