Difference between revisions of "Bengali and English/BugsAndIssues"

Revision as of 01:36, 25 June 2009

Nouns

  1. Only 800 tagged pure nouns from the Anubadok dictionary matched against CRBLP's list of the 20K most frequently used words
     • need to tag more manually (the en-es package has approx. 5K; we need to reach that level)
     • Anubadok has about 2000 nouns in its own list
  2. Some nouns are always pl or sg; need to tag those
  3. We are excluding proper nouns for now
  4. We are excluding adjectives that can be used as nouns, for now
  5. We are keeping track of plural form generation through animacy; this is good for now, but in the long run we need to come up with something more sophisticated
  6. Some nouns can have hybrid animacy; need to tag those later
  7. Should we tag the subtype of noun?
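
As a rough illustration of the animacy-driven plural generation mentioned above, here is a minimal Python sketch. It is not the actual dictionary implementation: the suffix table, the class names "animate" and "inanimate", and the functions pluralize and plural_candidates are all assumptions made for this example. It also ignores phonological variants (such as the -এরা form after consonants), which is part of why something more sophisticated would be needed in the long run.

```python
# Hypothetical suffix table keyed by animacy class (an assumption for
# illustration; the real paradigms in the dictionaries may differ).
PLURAL_SUFFIX = {
    "animate": "রা",      # e.g. ছেলে -> ছেলেরা
    "inanimate": "গুলো",  # e.g. বই -> বইগুলো
}


def pluralize(lemma: str, animacy: str) -> str:
    """Append the plural suffix associated with the given animacy class."""
    if animacy not in PLURAL_SUFFIX:
        raise ValueError(f"unknown animacy class: {animacy!r}")
    return lemma + PLURAL_SUFFIX[animacy]


def plural_candidates(lemma: str, animacies: list[str]) -> list[str]:
    """Return one plural form per animacy tag.

    A noun with hybrid animacy simply carries more than one tag and
    therefore yields more than one candidate form.
    """
    return [pluralize(lemma, a) for a in animacies]
```

A noun tagged with both classes would produce both forms, for example plural_candidates("ছেলে", ["animate", "inanimate"]), which is one simple way the hybrid-animacy case could be represented until a better scheme is in place.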

Pronouns

Adjective

Verb

Adverb

Determiner