Difference between revisions of "Talk:Scandinavian MT project"
Revision as of 09:10, 8 March 2016
Coverage on Wikipedia dumps. "w/o cmp" means decompounding is turned off, i.e. lt-proc is run without the -e switch.
For example:
bzcat ~/corpora/nnclean2.txt.bz2 \
  | tr ' ' '\n' \
  | grep -m5113060 . \
  | apertium-deshtml \
  | lt-proc nno-dan.automorf.bin \
  | apertium-cleanstream -n \
  | awk 'BEGIN{OFS=FS="\t"} /^\^/{lu++} /\/\*/{u++} END{print "unk","known","tot","cov %";print u,lu-u,lu,100*(lu-u)/lu}'
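To check the counting step without a corpus or a compiled analyser, the awk logic can be tried on a few hand-written lines. This is only a sketch: the `sample` text and the `coverage` function name are made up for illustration, and the tags in the sample are not taken from any real dictionary. It assumes the same input shape as the pipeline above, one lexical unit per line starting with `^`, with unknown words marked by `/*`.

```shell
# Hypothetical sample in the shape produced by `apertium-cleanstream -n`:
# one lexical unit per line; an unknown word is analysed as /*surface.
sample='^eit/ein<det><ind><nt><sg>$
^hus/hus<n><nt><sg><ind>$
^blorgh/*blorgh$
^er/vere<vblex><pres>$'

# Same counting as the awk step above, printing only the numbers:
# unknown, known, total, coverage in percent.
coverage() {
  awk '/^\^/{lu++} /\/\*/{u++}
       END{if (lu) printf "%d %d %d %.1f\n", u, lu-u, lu, 100*(lu-u)/lu}'
}

printf '%s\n' "$sample" | coverage
```

With one unknown unit out of four, this prints `1 3 4 75.0`. The `if (lu)` guard is an addition that avoids a division by zero on empty input.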
Direction | w/o cmp | regular |
---|---|---|
nob-nno | | |
nno-nob | | |
dan-nob | | |
dan-nno | 84.5% | |
nno-dan | 87.6% | 89.0% |
nob-dan | 89.7% | 91.4% |
dan-swe | 80.6% | 83.0% |
swe-dan | 80.4% | 83.8% |
swe-nno | | |
swe-nob | | |
nno-swe | | |
nob-swe | | |