Serbo-Croatian and Macedonian/Final report
Description
Statistics
- Dictionaries
- sh morphological analyser lexicon: 7564 lemmata, 170787 surface forms (including ekavian/ijekavian)
- apertium-sh-mk.sh-mk.dix (unique: 9985, total: 13032)
- Coverage
- (bs|hr|sr|sh) Wikipedia ( , std. dev.: )
- (sr|hr) SETimes (73.624631444, std. dev.: 0.488418931215)
- Testvoc
POS | Total | Clean | With @ | With # | Clean % |
---|---|---|---|---|---|
adj | 2072699 | 2072699 | 0 | 0 | 100 |
vblex | 246713 | 246713 | 0 | 0 | 100 |
vbmod | 96020 | 96020 | 0 | 0 | 100 |
np | 54190 | 54190 | 0 | 0 | 100 |
n | 41477 | 41477 | 0 | 0 | 100 |
adv | 7808 | 7808 | 0 | 0 | 100 |
prn | 7662 | 7662 | 0 | 0 | 100 |
num | 4284 | 4284 | 0 | 0 | 100 |
vbhaver | 224 | 224 | 0 | 0 | 100 |
vbser | 170 | 170 | 0 | 0 | 100 |
pr | 77 | 77 | 0 | 0 | 100 |
abbr | 33 | 33 | 0 | 0 | 100 |
cnjsub | 30 | 30 | 0 | 0 | 100 |
cnjcoo | 20 | 20 | 0 | 0 | 100 |
vaux | 0 | 0 | 0 | 0 | 100 |
rel | 0 | 0 | 0 | 0 | 100 |
preadv | 0 | 0 | 0 | 0 | 100 |
ij | 0 | 0 | 0 | 0 | 100 |
guio | 0 | 0 | 0 | 0 | 100 |
det | 0 | 0 | 0 | 0 | 100 |
cnjadv | 0 | 0 | 0 | 0 | 100 |
cm | 0 | 0 | 0 | 0 | 100 |
- Rules
- apertium-sh-mk.sh-mk.t1x: 51
- apertium-sh-mk.sh-mk.t2x: 11
- apertium-sh-mk.sh-mk.t3x: 1
- Error rate (realistic results are available so far only for setimes.pilots.txt; the other files have only been preliminarily post-edited)
File | Num. Words | % OOV | WER (Sur) | PER (Sur) | WER (Lem) | PER (Lem) |
---|---|---|---|---|---|---|
setimes.pilots.txt | 454 | 0.44% | 29.96% | 20.48% | - | - |
setimes.tablice.txt | 466 | 0.43% | 12.23% | 9.23% | - | - |
setimes.klupa.txt | 477 | 18.12% | 14.68% | 12.37% | - | - |
setimes.povijest.txt | 519 | 14.18% | 11.95% | 9.25% | - | - |
wikipedia.txt | - | - | - | - | - | - |
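
For readers unfamiliar with the two metrics, the following is a minimal sketch of how word error rate (WER) and position-independent error rate (PER) are commonly defined. It is not necessarily the exact script used to produce the figures above. It assumes whitespace tokenisation, takes the post-edited text as the reference, and the helper names `edit_distance`, `wer` and `per` are chosen for this example only.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Edit distance divided by the reference length."""
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Like WER, but word order is ignored (bag-of-words comparison)."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Toy example: the hypothesis has the right words in the wrong order.
reference = "a b c d".split()
hypothesis = "a c b d".split()
print(f"WER = {wer(reference, hypothesis):.2%}")
print(f"PER = {per(reference, hypothesis):.2%}")
```

In the toy example the reordering costs two substitutions for WER but nothing for PER, which is why PER is never higher than WER in the table above.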