Hindi and Bengali for GSoC

This is a language pair translating between Hindi and Bengali.


Currently the translator is very basic. We need to increase its coverage so that it handles more of the vocabulary of both languages. We also need to add more transfer rules so that all the Pending Tests pass and the translations become more accurate.


  • Closed Categories (n, adj, vblex, vbser, adv, prn, post, cnjcoo, cnjsub, cnjadv, det, num, ord).
  • Most frequently used nouns, post, adj, adv, det added.
  • Hin > Ben transfer rules for nouns, verb tenses and adj added.
  • Testing scripts and test corpus.

Todo list

  • Increase coverage of the translator by adding more nouns, adjectives and verbs from the list of most frequently used words in the corpus. Reference
  • Add transfer rules to fix pronoun generation errors, marked with # in the output (obj -> obl, nom -> nom, erg conversion).
  • Write transfer rules for Pending Tests (Ben > Hin and Hin > Ben).
  • Remove the prox and dist tags in the bidix and replace them with suitable paradigms for det.prox & det.dist (ইটা / ওটা).
  • Do disambiguation.
  • Reduce Word Error Rate.
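
Word Error Rate is usually taken to be the word-level edit distance between the translator output and a reference translation, divided by the number of words in the reference. A minimal sketch of that calculation follows. It is not the testing script mentioned above; the function name word_error_rate and the whitespace tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenisation is plain
    whitespace splitting, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not real translations.
    print(word_error_rate("a b c d", "a x c"))
```

The value can exceed 1 when the hypothesis is much longer than the reference, so it is better read as an error count per reference word than as a percentage.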

