Difference between revisions of "Serbo-Croatian and Macedonian/Final report"
Revision as of 17:55, 25 August 2011
73.624631444 +/- 0.488418931215
Description
Statistics
- Dictionaries
- sh morphological analyser lexicon: 7564 lemmata, 170787 surface forms (including ekavian/ijekavian)
apertium-sh-mk.sh-mk.dix
(unique: , total: )
- Coverage
- (bs|hr|sr|sh) Wikipedia ( , std. dev.: )
- (sr|hr) SETimes ( , std. dev.: )
- Testvoc
- Rules
- Number of rules: 51 in t1x, 11 in t2x, 1 in t3x
- Error rate (realistic results so far only for setimes.pilots.txt; the rest are preliminary, post-edited figures)
File | Num. Words | % OOV | WER (Sur) | PER (Sur) | WER (Lem) | PER (Lem) |
---|---|---|---|---|---|---|
setimes.pilots.txt | 454 | 0.44% | 29.96% | 20.48% | - | - |
setimes.tablice.txt | 466 | 0.43% | 12.23% | 9.23% | - | - |
setimes.klupa.txt | 477 | 18.12% | 14.68% | 12.37% | - | - |
setimes.povijest.txt | 519 | 14.18% | 11.95% | 9.25% | - | - |
wikipedia.txt | - | - | - | - | - | - |
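
For readers unfamiliar with the WER and PER columns, the following is a minimal sketch of how the two rates are commonly defined over word tokens. It is not the evaluation script used for this report: the function names `edit_distance`, `wer` and `per` are hypothetical, whitespace tokenisation is assumed, and the PER formula is one common definition (bag-of-words matches, ignoring word order), which may differ in detail from the tool that produced the figures above.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: edit distance divided by reference length."""
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate (assumed definition):
    words that cannot be matched regardless of order, over reference length."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    # Hypothetical token sequences, for illustration only.
    reference = "a b c d".split()
    hypothesis = "a c b d".split()
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"PER: {per(reference, hypothesis):.2%}")
```

Because PER ignores word order, it is never higher than WER for the same pair of texts, which matches the pattern in the table (for example 29.96% WER against 20.48% PER for setimes.pilots.txt).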