Difference between revisions of "Serbo-Croatian and Macedonian/Final report"

From Apertium
Line 15:
 ; Testvoc
+Thu Aug 25 20:14:41 CEST 2011
+===============================================
+{|class=wikitable
+! POS !! Total !! Clean !! With @ !! With # !! Clean %
+|-
+| adj||2072699||2072699||0||0||100
+|-
+| vblex||246713||246713||0||0||100
+|-
+| vbmod||96020||96020||0||0||100
+|-
+| np||54190||54190||0||0||100
+|-
+| n||41477||41477||0||0||100
+|-
+| adv||7808||7808||0||0||100
+|-
+| prn||7662||7662||0||0||100
+|-
+| num||4284||4284||0||0||100
+|-
+| vbhaver||224||224||0||0||100
+|-
+| vbser||170||170||0||0||100
+|-
+| pr||77||77||0||0||100
+|-
+| abbr||33||33||0||0||100
+|-
+| cnjsub||30||30||0||0||100
+|-
+| cnjcoo||20||20||0||0||100
+|-
+| vaux||0||0||0||0||100
+|-
+| rel||0||0||0||0||100
+|-
+| preadv||0||0||0||0||100
+|-
+| ij||0||0||0||0||100
+|-
+| guio||0||0||0||0||100
+|-
+| det||0||0||0||0||100
+|-
+| cnjadv||0||0||0||0||100
+|-
+| cm||0||0||0||0||100
+|}
 ; Rules

Revision as of 18:29, 25 August 2011

73.624631444 +/- 0.488418931215

Description

Statistics

Dictionaries
  • sh morphological analyser lexicon: 7564 lemmata, 170787 surface forms (including ekavian/ijekavian)
  • apertium-sh-mk.sh-mk.dix (unique: 9985, total: 13032)
Coverage
  • (bs|hr|sr|sh) Wikipedia ( , std. dev.: )
  • (sr|hr) SETimes ( , std. dev.: )
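
The coverage figures are given as a value with a standard deviation. As a rough sketch of how such figures could be obtained, the Python below counts how many lexical units in analyser output received a real analysis and then summarises several samples. It assumes the text is in Apertium stream format, where an unknown word appears as `^word/*word$`. The function names, the sample strings and the choice of population standard deviation are illustrative assumptions, not the procedure used for this report.

```python
import re
from statistics import mean, pstdev

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (simplified: escaped characters inside a unit are not handled)
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text):
    """Return the percentage of lexical units that received a real analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    # An analysis that starts with '*' marks an unknown word.
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return 100.0 * known / len(units)


def summarise(samples):
    """Return (mean, standard deviation) of coverage over analysed samples."""
    values = [coverage(sample) for sample in samples]
    return mean(values), pstdev(values)


# Made-up analyser output, for illustration only.
samples = [
    "^kuća/kuća<n><f><sg><nom>$ ^xyzzy/*xyzzy$",
    "^i/i<cnjcoo>$ ^grad/grad<n><mi><sg><nom>$",
]
print(summarise(samples))
```

Whether the reported deviation is a population or a sample standard deviation is not stated; `statistics.stdev` would give the sample version.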
Testvoc

Thu Aug 25 20:14:41 CEST 2011

===================================
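{|class=wikitable
! POS !! Total !! Clean !! With @ !! With # !! Clean %
|-
| adj||2072699||2072699||0||0||100
|-
| vblex||246713||246713||0||0||100
|-
| vbmod||96020||96020||0||0||100
|-
| np||54190||54190||0||0||100
|-
| n||41477||41477||0||0||100
|-
| adv||7808||7808||0||0||100
|-
| prn||7662||7662||0||0||100
|-
| num||4284||4284||0||0||100
|-
| vbhaver||224||224||0||0||100
|-
| vbser||170||170||0||0||100
|-
| pr||77||77||0||0||100
|-
| abbr||33||33||0||0||100
|-
| cnjsub||30||30||0||0||100
|-
| cnjcoo||20||20||0||0||100
|-
| vaux||0||0||0||0||100
|-
| rel||0||0||0||0||100
|-
| preadv||0||0||0||0||100
|-
| ij||0||0||0||0||100
|-
| guio||0||0||0||0||100
|-
| det||0||0||0||0||100
|-
| cnjadv||0||0||0||0||100
|-
| cm||0||0||0||0||100
|}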
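
The columns of the testvoc table are related by simple arithmetic, sketched below in Python. In Apertium output, `@` marks a lemma that was not found in the bilingual dictionary and `#` marks a form the generator could not produce. The function names are hypothetical, and the sketch assumes each output line is counted under at most one of the two error columns. The value 100 for empty categories follows the table above.

```python
def tally(lines):
    """Count total lines, lines containing '@', and lines containing '#'."""
    total = with_at = with_hash = 0
    for line in lines:
        total += 1
        if "@" in line:
            with_at += 1
        elif "#" in line:
            # Assumption: a line with both marks is counted once, under '@'.
            with_hash += 1
    return total, with_at, with_hash


def clean_percent(total, with_at, with_hash):
    """Percentage of forms that passed through without an error mark."""
    if total == 0:
        # The table reports 100 for categories that contain no forms.
        return 100.0
    clean = total - with_at - with_hash
    return 100.0 * clean / total


print(clean_percent(2072699, 0, 0))
```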
Rules
  • Number of rules: 51 in t1x, 11 in t2x, 1 in t3x
Error rate (so far only the results for setimes.pilots.txt are realistic; the other files have only been preliminarily post-edited)
{|class=wikitable
! File !! Num. Words !! % OOV !! WER (Sur) !! PER (Sur) !! WER (Lem) !! PER (Lem)
|-
| setimes.pilots.txt||454||0.44%||29.96%||20.48%||-||-
|-
| setimes.tablice.txt||466||0.43%||12.23%||9.23%||-||-
|-
| setimes.klupa.txt||477||18.12%||14.68%||12.37%||-||-
|-
| setimes.povijest.txt||519||14.18%||11.95%||9.25%||-||-
|-
| wikipedia.txt||-||-||-||-||-||-
|}
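
For readers unfamiliar with the two metrics, the Python below sketches one common formulation of word error rate (WER) and position-independent error rate (PER). WER is the word-level edit distance between the machine translation and the post-edited reference, divided by the reference length. PER ignores word order and compares the two texts as bags of words. The function names are illustrative, and this is not necessarily the exact computation behind the figures in the table.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def wer(ref, hyp):
    """Word error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate in percent (word order is ignored)."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return 100.0 * (max(len(ref), len(hyp)) - matches) / len(ref)


reference = "a b c d".split()
reordered = "b a c d".split()
print(wer(reference, reordered), per(reference, reordered))
```

With this formulation, a translation that contains the right words in the wrong order is penalised by WER but not by PER, which is why PER is lower than WER in every row of the table.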

Future work