Difference between revisions of "Bengali and English/Final report"
Revision as of 14:39, 26 August 2011
Description
Monodix
The Bengali monodix is now in fairly good shape, with around 80% coverage of the Bengali Wikipedia. It contains about 8230 lemmas: 3594 nouns, 1766 proper nouns, 1620 adjectives, 473 adverbs and 777 other lemmas. We look forward to increasing the coverage while completing the transfer rules.
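As a rough illustration of what a coverage figure like this measures, the sketch below computes the share of analysed tokens that received at least one analysis. It assumes the analyser output is in Apertium stream format, where an unknown word is emitted as `^word/*word$`; the function name `coverage` and the sample string (including its tags) are made up for the example and are not taken from the project's own scripts.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2...$
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")


def coverage(analysed_text: str) -> float:
    """Return the fraction of lexical units that have a known analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        # Assumption: unknown words are marked with a leading '*' in the analysis.
        if not match.group(2).startswith("*"):
            known += 1
    return known / total if total else 0.0


# Hypothetical analyser output: two known tokens, two unknown ones.
sample = "^আমি/আমি<prn><p1><sg>$ ^xyz/*xyz$ ^বই/বই<n><sg>$ ^abc/*abc$"
print(f"{coverage(sample):.2%}")
```

On a real corpus the same ratio would be computed over the full analysed text rather than a short sample string.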
Bidix
The bidix currently consists of 7446 entries. Although this is a large number relative to the monodix, some common words are still missing. They will be added gradually as work on the transfer rules proceeds. In addition, not all of the English lemmas in the bidix exist in the English monodix, which also needs to be updated. Currently there are 3444 nouns, 1686 proper nouns, 1384 adjectives and 1243 other lemmas.
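Per-category counts like these can be re-derived from the dictionary file itself. The following is a minimal sketch, assuming the usual bidix layout of `<e><p><l>…</l><r>…</r></p></e>` entries and taking the first `<s n="…"/>` tag on the left side as the category; the helper name `count_entries_by_tag` and the four sample entries are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_entries_by_tag(dix_xml: str) -> Counter:
    """Count <e> entries, grouped by the first <s n="..."/> tag on the left side."""
    root = ET.fromstring(dix_xml)
    counts = Counter()
    for entry in root.iter("e"):
        left = entry.find("p/l")
        first_tag = left.find("s") if left is not None else None
        key = first_tag.get("n") if first_tag is not None else "(untagged)"
        counts[key] += 1
    return counts


# Hypothetical four-entry bidix fragment, for illustration only.
SAMPLE_BIDIX = """
<dictionary>
  <sdefs><sdef n="n"/><sdef n="adj"/><sdef n="np"/></sdefs>
  <section id="main" type="standard">
    <e><p><l>বই<s n="n"/></l><r>book<s n="n"/></r></p></e>
    <e><p><l>ভাল<s n="adj"/></l><r>good<s n="adj"/></r></p></e>
    <e><p><l>ঢাকা<s n="np"/></l><r>Dhaka<s n="np"/></r></p></e>
    <e><p><l>কলম<s n="n"/></l><r>pen<s n="n"/></r></p></e>
  </section>
</dictionary>
"""

counts = count_entries_by_tag(SAMPLE_BIDIX)
for tag, number in counts.most_common():
    print(tag, number)
print("total", sum(counts.values()))
```

Comparing the printed total with the sum of the per-category counts is a quick consistency check on the figures reported for a dictionary.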
Transfer rules
Statistics
- Dictionaries
  - apertium-bn-en.bn.dix: 8,230
  - apertium-bn-en.en-bn.dix: 7,495
- Coverage
  - Bengali Wikipedia: 80.59% +/- 1.7878%
  - Prothom Alo:
- Rules
  - apertium-bn-en.en-bn.t1x: 42
  - apertium-bn-en.en-bn.t2x: 16
  - apertium-bn-en.en-bn.t3x: 2
- Error rate
File | Num. Words | % OOV | WER (Sur) | PER (Sur) | WER (Lem) | PER (Lem) |
---|---|---|---|---|---|---|
prothom-alo | - | - |