Difference between revisions of "Serbo-Croatian and Macedonian/Final report"




Revision as of 18:32, 25 August 2011

73.624631444 +/- 0.488418931215

Description

Statistics

Dictionaries
  • sh morphological analyser lexicon: 7564 lemmata, 170787 surface forms (including ekavian/ijekavian)
  • apertium-sh-mk.sh-mk.dix (unique: 9985, total: 13032)
Coverage
  • (bs|hr|sr|sh) Wikipedia ( , std. dev.: )
  • (sr|hr) SETimes ( , std. dev.: )
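A coverage figure with a standard deviation, like the ones reported here, can be obtained by measuring several corpus samples and summarising the results. The sketch below is only an illustration, not the script used for this report. It assumes the input is Apertium analyser output in stream format (`^surface/analysis$`, with unknown words analysed as `*surface`), ignores escaped characters, and uses the population standard deviation. The function names are hypothetical.

```python
import re
from statistics import mean, pstdev

# One lexical unit in the Apertium stream format: ^surface/analysis1/...$
# (simplification: escaped ^, $ and / characters are not handled)
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def naive_coverage(analysed: str) -> float:
    """Percentage of lexical units that received at least one analysis."""
    units = LEXICAL_UNIT.findall(analysed)
    if not units:
        return 0.0
    known = 0
    for unit in units:
        parts = unit.split("/", 1)
        # An unknown word is printed as ^word/*word$ by the analyser.
        if len(parts) == 2 and not parts[1].startswith("*"):
            known += 1
    return 100.0 * known / len(units)


def coverage_summary(samples: list[str]) -> tuple[float, float]:
    """Mean coverage and population standard deviation over the samples."""
    values = [naive_coverage(sample) for sample in samples]
    return mean(values), pstdev(values)
```

Each element of `samples` would be the analysed text of one corpus slice; whether the report used fixed-size slices or another sampling scheme is not stated.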
Testvoc
POS      Total    Clean    With @  With #  Clean %
adj      2072699  2072699  0       0       100
vblex    246713   246713   0       0       100
vbmod    96020    96020    0       0       100
np       54190    54190    0       0       100
n        41477    41477    0       0       100
adv      7808     7808     0       0       100
prn      7662     7662     0       0       100
num      4284     4284     0       0       100
vbhaver  224      224      0       0       100
vbser    170      170      0       0       100
pr       77       77       0       0       100
abbr     33       33       0       0       100
cnjsub   30       30       0       0       100
cnjcoo   20       20       0       0       100
vaux     0        0        0       0       100
rel      0        0        0       0       100
preadv   0        0        0       0       100
ij       0        0        0       0       100
guio     0        0        0       0       100
det      0        0        0       0       100
cnjadv   0        0        0       0       100
cm       0        0        0       0       100
Rules

  • apertium-sh-mk.sh-mk.t1x: 51
  • apertium-sh-mk.sh-mk.t2x: 11
  • apertium-sh-mk.sh-mk.t3x: 1

Error rate (results are realistic only for setimes.pilots.txt so far; the other files have only been preliminarily post-edited)
File                  Num. Words  % OOV   WER (Sur)  PER (Sur)  WER (Lem)  PER (Lem)
setimes.pilots.txt    454         0.44%   29.96%     20.48%     -          -
setimes.tablice.txt   466         0.43%   12.23%     9.23%      -          -
setimes.klupa.txt     477         18.12%  14.68%     12.37%     -          -
setimes.povijest.txt  519         14.18%  11.95%     9.25%      -          -
wikipedia.txt         -           -       -          -          -          -
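For readers unfamiliar with the two metrics in the table, the following minimal sketch shows one common way to compute word error rate (WER) and position-independent error rate (PER) over word lists. It is an illustration and not the evaluation script used for this report. It assumes the post-edited text is the reference and the raw translation is the hypothesis, tokenises on whitespace, and uses one widespread PER definition among several. The function names are hypothetical.

```python
from collections import Counter


def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: list[str], hyp: list[str]) -> float:
    """Word error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def per(ref: list[str], hyp: list[str]) -> float:
    """Position-independent error rate in percent (word order is ignored)."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    errors = max(len(ref), len(hyp)) - matches
    return 100.0 * errors / len(ref)


if __name__ == "__main__":
    reference = "a b c d".split()
    hypothesis = "b a c d".split()
    print(wer(reference, hypothesis), per(reference, hypothesis))
```

Because PER ignores word order, it is never higher than WER for the same pair of texts, which matches the pattern in the table; the gap between the two gives a rough idea of how many errors are due to reordering.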

Future work