Difference between revisions of "Serbo-Croatian and Macedonian/Final report"


Revision as of 17:55, 25 August 2011

73.624631444 +/- 0.488418931215

Description

Statistics

Dictionaries
  • sh morphological analyser lexicon: 7564 lemmata, 170787 surface forms (including Ekavian and Ijekavian variants)
  • apertium-sh-mk.sh-mk.dix (unique: , total: )
Coverage
  • (bs|hr|sr|sh) Wikipedia ( , std. dev.: )
  • (sr|hr) SETimes ( , std. dev.: )
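The coverage figures are reported as a mean with a standard deviation. As a rough illustration only, the sketch below computes naive coverage for a list of output tokens and then summarises several such samples. It assumes unknown words carry a leading asterisk, as in Apertium's default output. The function names and the choice of the sample standard deviation are assumptions made for this example, not the script actually used for this report.

```python
import statistics


def coverage(tokens):
    """Percentage of tokens that are NOT marked unknown.

    Assumption: an unknown word is marked with a leading '*'.
    """
    if not tokens:
        raise ValueError("cannot compute coverage of an empty token list")
    known = sum(1 for tok in tokens if not tok.startswith("*"))
    return 100.0 * known / len(tokens)


def mean_and_std(values):
    """Mean and sample standard deviation of per-sample coverage values.

    Assumption: the sample (n - 1) standard deviation is wanted;
    at least two values are required.
    """
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical toy samples, not data from this report.
    samples = [
        ["ovo", "je", "*nepoznato", "slovo"],
        ["sve", "je", "poznato", "ovdje"],
    ]
    per_sample = [coverage(s) for s in samples]
    mean, std = mean_and_std(per_sample)
    print(per_sample, mean, std)
```

Splitting a corpus into several samples and reporting the spread gives some indication of how stable the coverage estimate is across different text.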
Testvoc
Rules
  • Number of rules: 51 in t1x, 11 in t2x, 1 in t3x
Error rate (realistic results are so far available only for setimes.pilots.txt; the other files have only been preliminarily post-edited)
File | Num. words | % OOV | WER (Sur) | PER (Sur) | WER (Lem) | PER (Lem)
setimes.pilots.txt | 454 | 0.44% | 29.96% | 20.48% | - | -
setimes.tablice.txt | 466 | 0.43% | 12.23% | 9.23% | - | -
setimes.klupa.txt | 477 | 18.12% | 14.68% | 12.37% | - | -
setimes.povijest.txt | 519 | 14.18% | 11.95% | 9.25% | - | -
wikipedia.txt | - | - | - | - | - | -
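For readers unfamiliar with the two metrics, the following is a minimal sketch of how a word error rate (WER) and a position-independent error rate (PER) can be computed between a post-edited reference and the raw translation. It is not the evaluation script used to produce the figures in this report. WER here is the word-level Levenshtein distance divided by the reference length. PER uses one common definition based on bag-of-words overlap, and that choice is an assumption. All function names are hypothetical.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate in percent.

    Assumed definition: (max(|ref|, |hyp|) - matching words) / |ref|,
    where matches are counted regardless of word order.
    """
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return 100.0 * (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    # Hypothetical toy sentences, not data from this report.
    reference = "a b c d".split()
    hypothesis = "a c b d".split()
    print(wer(reference, hypothesis), per(reference, hypothesis))
```

Because PER ignores word order, it is never higher than WER under these definitions when both are computed on the same pair, which matches the pattern in the table.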

Future work