Difference between revisions of "Bengali and English/BugsAndIssues"
Darthxaher (talk | contribs) (→Nouns) |
Revision as of 01:36, 25 June 2009
Nouns
- Only 800 tagged pure nouns from the Anubadok dictionary matched against CRBLP's list of the 20K most frequently used words
  - Need to tag more manually (the en-es package has approx. 5K; need to reach that)
  - Anubadok has about 2000 nouns in its own list
- Some nouns are always plural or always singular; need to tag those
- We are excluding proper nouns for now
- We are excluding adjectives that can be used as nouns, for now
- We are handling plural form generation through animacy. This works for now, but in the long run we need to come up with something more sophisticated
- Some nouns can have hybrid animacy; need to tag those later
- Should we tag the subtype of noun?
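
The animacy-based plural generation mentioned in the list could be sketched roughly as below. This is only an illustration and not the project's actual rules: the animacy labels, the `plural_forms` helper, and the reduction to two suffixes (-রা for animate nouns, -গুলো for inanimate ones) are all assumptions made for the example, and real Bengali plural formation has more variants than this.

```python
# Hypothetical animacy labels; the notes do not specify the real tag names.
ANIMATE = "animate"
INANIMATE = "inanimate"
HYBRID = "hybrid"

# Simplified suffix choice (an assumption for this sketch):
# -রা for animate nouns, -গুলো for inanimate nouns.
PLURAL_SUFFIX = {
    ANIMATE: "রা",
    INANIMATE: "গুলো",
}


def plural_forms(lemma, animacy):
    """Return the candidate plural forms of a noun lemma.

    A hybrid noun yields both forms, because the animacy tag alone
    cannot decide which suffix applies.
    """
    if animacy == HYBRID:
        return [lemma + PLURAL_SUFFIX[ANIMATE], lemma + PLURAL_SUFFIX[INANIMATE]]
    return [lemma + PLURAL_SUFFIX[animacy]]


if __name__ == "__main__":
    print(plural_forms("ছাত্র", ANIMATE))
    print(plural_forms("বই", INANIMATE))
```

A hybrid-animacy noun returning two candidates shows why a single animacy tag is not enough in the long run: something else (context, or a finer noun subtype) would have to pick between them.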