Difference between revisions of "Bengali and English/BugsAndIssues"

From Apertium

Revision as of 05:02, 25 June 2009

Nouns

  1. Only 800 tagged pure nouns from the Anubadok dictionary matched against CRBLP's list of the 20K most frequently used words
  • Need to tag more manually (the en-es package has approx. 5K; we need to reach that)
  • Anubadok has about 2000 nouns in its own list
  • Anubadok has about 2300 proper nouns in its own list
  2. Some nouns are always plural (pl) or singular (sg); need to tag those
  3. We are excluding proper nouns for now
  4. We are excluding adjectives that can be used as nouns for now
  5. We are keeping track of plural form generation through animacy; this is good, but in the long run we need to come up with something more sophisticated
  6. Some nouns can have hybrid animacy; need to tag those later
  7. Should we tag the subtype of noun?
  8. মা - মারা, জনক - জনকরা: these plural forms are wrong; need to add a rule to fix that
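
The wrong plurals noted above (মা - মারা, জনক - জনকরা) look like what you get when the animate suffix রা is attached to every noun regardless of its ending. Below is a minimal Python sketch of one possible fix, not the rule actually used in the Apertium dictionaries. It assumes that consonant-final animate nouns take েরা, vowel-final ones take রা, inanimate nouns take গুলো, and that irregular nouns such as মা are listed explicitly. The function names, the suffix tables, and the exception entry are all hypothetical and would need checking by a Bengali speaker.

```python
# Hypothetical suffix tables; the rule itself is an assumption, not Apertium's.
ANIMATE_AFTER_VOWEL = "রা"
ANIMATE_AFTER_CONSONANT = "েরা"
INANIMATE = "গুলো"

# Irregular animate plurals listed by hand (assumed entry, for illustration).
IRREGULAR_PLURALS = {"মা": "মায়েরা"}

# Consonant-like code points outside the main range U+0995..U+09B9:
# nukta (follows a consonant), khanda ta, rra, rha, yya.
EXTRA_CONSONANTS = {"\u09bc", "\u09ce", "\u09dc", "\u09dd", "\u09df"}


def ends_in_consonant(word: str) -> bool:
    """Return True if the last code point is a bare Bengali consonant."""
    if not word:
        return False
    last = word[-1]
    return "\u0995" <= last <= "\u09b9" or last in EXTRA_CONSONANTS


def pluralize(noun: str, animacy: str) -> str:
    """Build a plural form from the noun's animacy tag and its final letter."""
    if animacy != "animate":
        return noun + INANIMATE
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    if ends_in_consonant(noun):
        return noun + ANIMATE_AFTER_CONSONANT
    return noun + ANIMATE_AFTER_VOWEL
```

With this sketch, `pluralize("জনক", "animate")` appends েরা instead of রা, and `pluralize("মা", "animate")` is served from the exception table. Nouns with hybrid animacy would still need a separate decision about which tag to pass in.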

Pronouns

Adjective

Verb

Adverb

Determiner