Difference between revisions of "Serbo-Croatian and Macedonian/Final report"
Note: this is still just a sketch.
Revision as of 17:54, 25 August 2011
73.624631444 +/- 0.488418931215
Description
Statistics
- Dictionaries
- sh morphological analyser lexicon: 7564 lemmata, 170787 surface forms (including ekavian/ijekavian)
- apertium-sh-mk.sh-mk.dix (unique: , total: )
- Coverage
- (bs|hr|sr|sh) Wikipedia ( , std. dev.: )
- (sr|hr) SETimes ( , std. dev.: )
- Testvoc
- Rules
- Error rate (realistic results are available so far only for setimes.pilots.txt; the other files have only been preliminarily post-edited)
File | Num. Words | % OOV | WER (Sur) | PER (Sur) | WER (Lem) | PER (Lem) |
---|---|---|---|---|---|---|
setimes.pilots.txt | 454 | 0.44% | 29.96% | 20.48% | - | - |
setimes.tablice.txt | 466 | 0.43% | 12.23% | 9.23% | - | - |
setimes.klupa.txt | 477 | 18.12% | 14.68% | 12.37% | - | - |
setimes.povijest.txt | 519 | 14.18% | 11.95% | 9.25% | - | - |
wikipedia.txt | - | - | - | - | - | - |
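
For readers unfamiliar with the two error measures in the table, the following is a minimal sketch of how WER (word error rate) and PER (position-independent error rate) can be computed between a machine translation and its post-edited version. It is an illustration only: the report does not say which tool produced the figures above, the function names `edit_distance`, `wer` and `per` are hypothetical, and the PER formula used here is one common formulation, not necessarily the one behind the table.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate in percent (assumed formulation).

    Word order is ignored: both sides are compared as bags of words.
    """
    if not ref:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    errors = max(len(ref), len(hyp)) - matches
    return 100.0 * errors / len(ref)


if __name__ == "__main__":
    reference = "a b c d".split()    # e.g. the post-edited text
    hypothesis = "a c b d".split()   # e.g. the raw MT output
    print(wer(reference, hypothesis))  # 50.0
    print(per(reference, hypothesis))  # 0.0
```

Because PER ignores word order, it is never higher than WER under this formulation when both are measured against the same reference, which is consistent with every row of the table.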