Difference between revisions of "Serbo-Croatian and Macedonian/Final report"
Revision as of 17:58, 25 August 2011
73.624631444 +/- 0.488418931215
Description
Statistics
- Dictionaries
- sh morphological analyser lexicon: 7564 lemmata, 170787 surface forms (including ekavian/ijekavian)
- apertium-sh-mk.sh-mk.dix (unique: 9985, total: 13032)
- Coverage
- (bs|hr|sr|sh) Wikipedia ( , std. dev.: )
- (sr|hr) SETimes ( , std. dev.: )
- Testvoc
- Rules
- Number of rules: 51 in t1x, 11 in t2x, 1 in t3x
- Error rate (realistic results so far only for setimes.pilots.txt; the other files have only been preliminarily post-edited)
File | Num. Words | % OOV | WER (Sur) | PER (Sur) | WER (Lem) | PER (Lem) |
---|---|---|---|---|---|---|
setimes.pilots.txt | 454 | 0.44% | 29.96% | 20.48% | - | - |
setimes.tablice.txt | 466 | 0.43% | 12.23% | 9.23% | - | - |
setimes.klupa.txt | 477 | 18.12% | 14.68% | 12.37% | - | - |
setimes.povijest.txt | 519 | 14.18% | 11.95% | 9.25% | - | - |
wikipedia.txt | - | - | - | - | - | - |
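
For readers unfamiliar with the two error measures in the table, the following is a minimal sketch of how WER (word error rate) and PER (position-independent error rate) can be computed for one machine-translated text against its post-edited reference. It is not the tool used to produce the figures above; the function names and the toy sentences are made up for illustration, and the PER formula is one common definition. The "(Sur)" and "(Lem)" columns presumably refer to scoring on surface forms and on lemmata; the sketch works on whatever token sequence it is given.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Like WER, but word order is ignored (bag-of-words comparison)."""
    if not ref:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Toy data, not taken from the SETimes test files.
reference = "the pilots went on strike on monday".split()
hypothesis = "pilots the went on strike monday".split()

print(f"WER: {wer(reference, hypothesis):.2%}")
print(f"PER: {per(reference, hypothesis):.2%}")
```

Because PER ignores word order, it is never higher than WER for the same pair of texts, which matches the pattern in the table.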