Difference between revisions of "Hindi and Bengali"

From Apertium
Line 1: Line 1:


=Hindi and Bengali for GSoC=
=Hindi and Bengali for GSoC=

This is a language pair translating between [[Hindi]] and [[Bengali]].

==Goals==

Currently the translator is very basic. We need to increase its coverage so that it handles more of the vocabulary of both languages. We also need to add more transfer rules so that all the [https://wiki.apertium.org/wiki/Hindi_and_Bengali/Pending-Tests Pending Tests] pass and the translations become more accurate.


==Done==
==Done==
* Closed Categories (n, adj, vblex, vbser, adv, prn, post, cnjcoo, cnjsub, cnjadv, det, num, prn, ord).
* <s>Closed Categories (n, adj, vblex, vbser, adv, prn, post, cnjcoo, cnjsub, cnjadv, det, num, ord).</s>
* nouns, post, adj, adv, det from hitparade list.
* <s>Most frequently used nouns, post, adj, adv, det added.</s>
* Hin > Ben transfer rules on nouns, verbs tenses and adj.
* <s>Hin > Ben transfer rules for nouns, verb tenses and adj added.</s>
* Testing scripts and test corpus.
* <s>Testing scripts and test corpus.</s>


==Todo list==
==Todo list==
* Add more words for nouns, adjectives and verbs from hitparade list.
* Increase the coverage of the translator by adding more nouns, adjectives and verbs from the list of the most frequently used words in the corpus. [https://wiki.apertium.org/wiki/Building_dictionaries Reference]
* Add transfer rules to fix pronoun #s (obj -> obl , nom -> nom, erg conversion)
* Add transfer rules to fix pronoun #s (obj -> obl, nom -> nom, erg conversion).
* Transfer rules for [https://wiki.apertium.org/wiki/Hindi_and_Bengali/Pending-Tests Pending Tests for Apertium-ben-hin] (Ben > Hin and Hin > Ben).
* Write transfer rules for [https://wiki.apertium.org/wiki/Hindi_and_Bengali/Pending-Tests Pending Tests] (Ben > Hin and Hin > Ben).
* Lift prox and dist tag via making a suitable paradigm for det (ইটা / ওটা)
* Remove the prox and dist tags from the bidix and replace them with suitable paradigms for det.prox & det.dist (ইটা / ওটা).
* Do disambiguation.
* Reduce Word Error Rate.


==Apertium Git Repositories==
==Apertium Git Repositories==
Line 18: Line 26:
*[https://github.com/apertium/apertium-hin apertium-hin]
*[https://github.com/apertium/apertium-hin apertium-hin]
*[https://github.com/apertium/apertium-ben apertium-ben]
*[https://github.com/apertium/apertium-ben apertium-ben]
*[https://github.com/apertium/apertium-eng-hin apertium-eng-hin]


==External Resources==
==External Resources==
Line 49: Line 56:
* [[Bengali]]
* [[Bengali]]
* [[Hindi]]
* [[Hindi]]
* [[Hindi and English]]


[[Category:Hindi and Bengali]]
[[Category:Hindi and Bengali]]

Revision as of 16:41, 24 August 2021

Hindi and Bengali for GSoC

This is a language pair translating between Hindi and Bengali.

Goals

Currently the translator is very basic. We need to increase its coverage so that it handles more of the vocabulary of both languages. We also need to add more transfer rules so that all the Pending Tests pass and the translations become more accurate.

Done

  • Closed Categories (n, adj, vblex, vbser, adv, prn, post, cnjcoo, cnjsub, cnjadv, det, num, ord).
  • Most frequently used nouns, post, adj, adv, det added.
  • Hin > Ben transfer rules for nouns, verb tenses and adj added.
  • Testing scripts and test corpus.

Todo list

  • Increase the coverage of the translator by adding more nouns, adjectives and verbs from the list of the most frequently used words in the corpus. Reference
  • Add transfer rules to fix pronoun #s (obj -> obl, nom -> nom, erg conversion).
  • Write transfer rules for Pending Tests (Ben > Hin and Hin > Ben).
  • Remove the prox and dist tags from the bidix and replace them with suitable paradigms for det.prox & det.dist (ইটা / ওটা).
  • Do disambiguation.
  • Reduce Word Error Rate.
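
The Word Error Rate goal in the list above can be made concrete with a small sketch. This is not the project's evaluation script. It is a minimal illustration, assuming whitespace tokenisation and the usual definition of WER as the word-level edit distance (substitutions, insertions and deletions) between a reference translation and the translator's output, divided by the number of reference words. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace; no normalisation of
    punctuation or case is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Calling this on each sentence of a test corpus with a post-edited reference and averaging (or summing edits over total reference words) gives a single figure to track while adding dictionary entries and transfer rules. A lower value means the output is closer to the reference.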

Apertium Git Repositories

External Resources

General

Dictionaries

Corpora


See also