Difference between revisions of "Scandinavian MT project"
(→Status) |
|||
(14 intermediate revisions by 2 users not shown) | |||
Line 3: | Line 3: | ||
* https://meta.wikimedia.org/wiki/Grants:IEG/Pan-Scandinavian_Machine-assisted_Content_Translation |
* https://meta.wikimedia.org/wiki/Grants:IEG/Pan-Scandinavian_Machine-assisted_Content_Translation |
||
* https://meta.wikimedia.org/wiki/Skanwiki/Tinget |
* https://meta.wikimedia.org/wiki/Skanwiki/Tinget / https://meta.wikimedia.org/wiki/Skanwiki/Skanwikiprojekt_MT |
||
* http://blog.wikimedia.org/2016/06/01/scandinavian-wikipedias-content-translation/ / http://blogg.wikimedia.no/2015/12/14/verktoy-for-tverrskandinavisk-maskinomsetjing/ |
|||
* [[North Germanic languages]] |
* [[North Germanic languages]] |
||
Line 9: | Line 11: | ||
==Status== |
==Status== |
||
Note: UDHR was probably used during development on nno/nob stuff. |
|||
{|class=wikitable |
{|class=wikitable |
||
! Direction !! |
! Direction !! Pair !! Bidix !! Coverage<br/>UDHR – Wiki !! Testvoc !! Release date !! Released !! WER (-u) |
||
|- |
|- |
||
| nob-nno || [[apertium-nno-nob|nno-nob]] || |
| nob-nno || [[apertium-nno-nob|nno-nob]] ||69,397 || 99.2% – 92.6% || || ||✓ || 10.71 %<ref>[[Apertium-nno-nob#WER-test_28.2F8_2009]]</ref> |
||
|- |
|- |
||
| nno-nob || [[apertium-nno-nob|nno-nob]] ||69, |
| nno-nob || [[apertium-nno-nob|nno-nob]] ||69,397 || 98.9% – 90.6% || || || ✓ || |
||
|- |
|- |
||
| dan-nob || [[apertium-dan-nor|dan-nor]] || |
| dan-nob || [[apertium-dan-nor|dan-nor]] ||53,746 || 95.9% – 88.1% || ✓ || {{sc|1 februar 2016}} || ✓ || 10.87 %<ref>https://svn.code.sf.net/p/apertium/svn/trunk/apertium-dan-nor/WER/</ref> |
||
|- |
|- |
||
| dan-nno || [[apertium-dan-nor|dan-nor]] || |
| dan-nno || [[apertium-dan-nor|dan-nor]] ||53,746 || 92.7% – 87.3% || ✓ || {{sc|1 januar 2016}} || ✓ || 13.64 %, 22.64 %<ref>https://svn.code.sf.net/p/apertium/svn/trunk/apertium-dan-nor/WER/</ref> |
||
|- |
|- |
||
| nob-dan || [[apertium-dan-nor|dan-nor]] || |
| nob-dan || [[apertium-dan-nor|dan-nor]] ||53,746 || 98.6% – 91.7% || ✓ || ||✓ || |
||
|- |
|- |
||
| nno-dan || [[apertium-dan-nor|dan-nor]] || |
| nno-dan || [[apertium-dan-nor|dan-nor]] ||53,746 || 97.4% – 89.8% || ✓ || {{sc|1 april 2016}} || ✓ || |
||
|- |
|- |
||
| swe-nob || [[apertium-swe-nor|swe-nor]] || |
| swe-nob || [[apertium-swe-nor|swe-nor]] || 8,920 || 94.8% – 87.3% || || {{sc|17 mai 2016}} || ✓ || |
||
|- |
|- |
||
| swe-nno || [[apertium-swe-nor|swe-nor]] || |
| swe-nno || [[apertium-swe-nor|swe-nor]] || 8,920 || 93.8% – 87.0% || || {{sc|17 mai 2016}} || ✓ || |
||
|- |
|- |
||
| nob-swe || [[apertium-swe-nor|swe-nor]] || |
| nob-swe || [[apertium-swe-nor|swe-nor]] || 8,920 || 97.1% – 89.7% || || {{sc|17 mai 2016}} || ✓ || |
||
|- |
|- |
||
| nno-swe || [[apertium-swe-nor|swe-nor]] || |
| nno-swe || [[apertium-swe-nor|swe-nor]] || 8,920 || 96.6% – 87.4% || || {{sc|17 mai 2016}} || ✓ || |
||
|- |
|- |
||
| swe-dan || [[apertium-swe-dan|swe-dan]] || |
| swe-dan || [[apertium-swe-dan|swe-dan]] || 17,551 || 88.0% – 83.7% || || {{sc|1 mars 2016}} || ✓ || 31 %<ref>[[Apertium-swe-dan#Dansk_.28english_version_below.29]]</ref> |
||
|- |
|- |
||
| dan-swe || [[apertium-swe-dan|swe-dan]] || |
| dan-swe || [[apertium-swe-dan|swe-dan]] ||17,551 || 90.4% – 82.9% || || {{sc|1 mars 2016}} || ✓ || |
||
|- |
|- |
||
|} |
|} |
||
===Stats from stemcounterbot=== |
===Stats from stemcounterbot=== |
||
(bidix stems buggy for swe-pairs) |
|||
{|class=wikitable |
{|class=wikitable |
||
! Pair !! Bidix !! t1x rules |
! Pair !! Bidix !! t1x rules !! lrx rules |
||
|- |
|- |
||
| [[apertium-nno-nob/stats|nno-nob]] || {{#lst:Apertium-nno-nob/stats|nno-nob_stems}} || nno→nob: {{#lst:Apertium-nno-nob/stats|nno-nob_t1x_rules}}, nob→nno: {{#lst:Apertium-nno-nob/stats|nob-nno_t1x_rules}} |
| [[apertium-nno-nob/stats|nno-nob]] || {{#lst:Apertium-nno-nob/stats|nno-nob_stems}} || nno→nob: {{#lst:Apertium-nno-nob/stats|nno-nob_t1x_rules}}, nob→nno: {{#lst:Apertium-nno-nob/stats|nob-nno_t1x_rules}} || nno→nob: {{#lst:Apertium-nno-nob/stats|nno-nob_lrx_rules}}, nob→nno: {{#lst:Apertium-nno-nob/stats|nob-nno_lrx_rules}} |
||
|- |
|- |
||
| [[apertium-dan-nor/stats|dan-nor]] || {{#lst:Apertium-dan-nor/stats|dan-nor_stems}} || dan→nno: {{#lst:Apertium-dan-nor/stats|dan-nno_t1x_rules}}, dan→nob: {{#lst:Apertium-dan-nor/stats|dan-nob_t1x_rules}}, nor→dan: {{#lst:Apertium-dan-nor/stats|nor-dan_t1x_rules}} |
| [[apertium-dan-nor/stats|dan-nor]] || {{#lst:Apertium-dan-nor/stats|dan-nor_stems}} || dan→nno: {{#lst:Apertium-dan-nor/stats|dan-nno_t1x_rules}}, dan→nob: {{#lst:Apertium-dan-nor/stats|dan-nob_t1x_rules}}, nor→dan: {{#lst:Apertium-dan-nor/stats|nor-dan_t1x_rules}}|| dan→nno: {{#lst:Apertium-dan-nor/stats|dan-nno_lrx_rules}}, dan→nob: {{#lst:Apertium-dan-nor/stats|dan-nob_lrx_rules}}, nor→dan: {{#lst:Apertium-dan-nor/stats|nor-dan_lrx_rules}} |
||
|- |
|- |
||
| [[apertium-swe-nor/stats|swe-nor]] || {{#lst:Apertium-swe-nor/stats|swe-nor_stems}} || swe→nno: {{#lst:Apertium-swe-nor/stats|swe-nno_t1x_rules}}, swe→nob: {{#lst:Apertium-swe-nor/stats|swe-nob_t1x_rules}}, nor→swe: {{#lst:Apertium-swe-nor/stats|nor-swe_t1x_rules}} |
| [[apertium-swe-nor/stats|swe-nor]] || {{#lst:Apertium-swe-nor/stats|swe-nor_stems}} || swe→nno: {{#lst:Apertium-swe-nor/stats|swe-nno_t1x_rules}}, swe→nob: {{#lst:Apertium-swe-nor/stats|swe-nob_t1x_rules}}, nor→swe: {{#lst:Apertium-swe-nor/stats|nor-swe_t1x_rules}}|| swe→nno: {{#lst:Apertium-swe-nor/stats|swe-nno_lrx_rules}}, swe→nob: {{#lst:Apertium-swe-nor/stats|swe-nob_lrx_rules}}, nor→swe: {{#lst:Apertium-swe-nor/stats|nor-swe_lrx_rules}} |
||
|- |
|- |
||
| [[apertium-swe-dan/stats|swe-dan]] || {{#lst:Apertium-swe-dan/stats|swe-dan_stems}} || swe→dan: {{#lst:Apertium-swe-dan/stats|swe-dan_t1x_rules}}, dan→swe: {{#lst:Apertium-swe-dan/stats|dan-swe_t1x_rules}} |
| [[apertium-swe-dan/stats|swe-dan]] || {{#lst:Apertium-swe-dan/stats|swe-dan_stems}} || swe→dan: {{#lst:Apertium-swe-dan/stats|swe-dan_t1x_rules}}, dan→swe: {{#lst:Apertium-swe-dan/stats|dan-swe_t1x_rules}}|| swe→dan: {{#lst:Apertium-swe-dan/stats|swe-dan_lrx_rules}}, dan→swe: {{#lst:Apertium-swe-dan/stats|dan-swe_lrx_rules}} |
||
|- |
|- |
||
|} |
|} |
||
==Evaluation== |
|||
* [[Swedish]]: [https://svn.code.sf.net/p/apertium/svn/languages/apertium-swe/texts/kalmar.txt kalmar.txt] (1119 tokens) |
|||
* [[Danish]]: [https://svn.code.sf.net/p/apertium/svn/languages/apertium-dan/texts/venedig.txt venedig.txt] (1363 tokens) |
|||
* [[Norwegian Nynorsk]]: |
|||
* [[Norwegian Bokmål]]: |
|||
==Notes== |
==Notes== |
Latest revision as of 07:02, 28 June 2016
- https://meta.wikimedia.org/wiki/Skanwiki/Tinget / https://meta.wikimedia.org/wiki/Skanwiki/Skanwikiprojekt_MT
- http://blog.wikimedia.org/2016/06/01/scandinavian-wikipedias-content-translation/ / http://blogg.wikimedia.no/2015/12/14/verktoy-for-tverrskandinavisk-maskinomsetjing/
Status
Note: UDHR was probably used during development on nno/nob stuff.
Direction | Pair | Bidix | Coverage (UDHR – Wiki) | Testvoc | Release date | Released | WER (-u) |
---|---|---|---|---|---|---|---|
nob-nno | nno-nob | 69,397 | 99.2% – 92.6% | | | ✓ | 10.71 %[1] |
nno-nob | nno-nob | 69,397 | 98.9% – 90.6% | | | ✓ | |
dan-nob | dan-nor | 53,746 | 95.9% – 88.1% | ✓ | 1 February 2016 | ✓ | 10.87 %[2] |
dan-nno | dan-nor | 53,746 | 92.7% – 87.3% | ✓ | 1 January 2016 | ✓ | 13.64 %, 22.64 %[3] |
nob-dan | dan-nor | 53,746 | 98.6% – 91.7% | ✓ | | ✓ | |
nno-dan | dan-nor | 53,746 | 97.4% – 89.8% | ✓ | 1 April 2016 | ✓ | |
swe-nob | swe-nor | 8,920 | 94.8% – 87.3% | | 17 May 2016 | ✓ | |
swe-nno | swe-nor | 8,920 | 93.8% – 87.0% | | 17 May 2016 | ✓ | |
nob-swe | swe-nor | 8,920 | 97.1% – 89.7% | | 17 May 2016 | ✓ | |
nno-swe | swe-nor | 8,920 | 96.6% – 87.4% | | 17 May 2016 | ✓ | |
swe-dan | swe-dan | 17,551 | 88.0% – 83.7% | | 1 March 2016 | ✓ | 31 %[4] |
dan-swe | swe-dan | 17,551 | 90.4% – 82.9% | | 1 March 2016 | ✓ | |
Stats from stemcounterbot
Pair | Bidix | t1x rules | lrx rules |
---|---|---|---|
nno-nob | 69,479 | nno→nob: 22, nob→nno: 85 | nno→nob: , nob→nno: |
dan-nor | 54,742 | dan→nno: 31, dan→nob: 24, nor→dan: 22 | dan→nno: , dan→nob: , nor→dan: |
swe-nor | 66,856 | swe→nno: 30, swe→nob: 26, nor→swe: 27 | swe→nno: , swe→nob: , nor→swe: |
swe-dan | 21,791 | swe→dan: 24, dan→swe: 26 | swe→dan: , dan→swe: |
Evaluation
- Swedish: kalmar.txt (1119 tokens)
- Danish: venedig.txt (1363 tokens)
- Norwegian Nynorsk:
- Norwegian Bokmål:
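
The WER column in the status table reports word error rate. As a rough illustration of how such a figure is commonly computed (word-level edit distance between a post-edited reference and the raw MT output, divided by the reference length), here is a minimal sketch. It assumes plain whitespace tokenization and does not model the "(-u)" option; the function name `word_error_rate` and the sample sentences are hypothetical and are not taken from the project's own evaluation scripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are separated by whitespace; punctuation and
    casing are left untouched.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference tokens
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of four.
wer = word_error_rate("eg har ei bok", "eg har en bok")
print(f"{wer:.2%}")
```

A figure such as 10.71 % would then correspond to roughly one edited word in every nine or ten reference words, under the assumptions stated above.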