Bengali and English/Transfer Chunker

Rules

1. NOM (chunk: nom)

  Pattern : noun/proper noun
  Action  : assign <p3><infml> to person
  Chunk   : nom<SN><[gender]><[number]><nom>
  Comment : catches the nominals
  Example : "Bangladesh" = "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"
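
The chunk notation used in the examples on this page can be taken apart mechanically. The following is a minimal sketch in Python, not part of Apertium itself: the function name `parse_chunk` and the regular expressions are hypothetical and assume only the `^name<tags>{^lemma<tags>$ ...}$` layout shown in the example above.

```python
import re

# Hypothetical patterns, derived only from the examples on this page.
CHUNK_RE = re.compile(r"\^([^<{]+)((?:<[^>]+>)*)\{(.*)\}\$")
UNIT_RE = re.compile(r"\^([^<$]+)((?:<[^>]+>)*)\$")
TAG_RE = re.compile(r"<([^>]+)>")


def parse_chunk(text):
    """Split one chunk into (chunk name, chunk tags, lexical units)."""
    match = CHUNK_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a chunk: %r" % text)
    name, tags, body = match.groups()
    # Each lexical unit inside the braces becomes (lemma, [tags]).
    units = [(lemma, TAG_RE.findall(unit_tags))
             for lemma, unit_tags in UNIT_RE.findall(body)]
    return name, TAG_RE.findall(tags), units


name, tags, units = parse_chunk(
    "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$")
print(name)   # chunk name
print(tags)   # chunk-level tags
print(units)  # lexical units inside the chunk
```

The same function also handles chunks with several lexical units, such as the adj_nom example.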


2. ADJ NOM (chunk: adj_nom)

  Pattern : adjective proper-noun
  Action  : 
  Chunk   : adj_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with adjective
  Example : "Beautiful Bangladesh" = "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"


3. ADJ (chunk: adj)

  Pattern : adjective
  Action  : 
  Chunk   : adj<SN><[adj attribute]><[gender]><nom>
  Comment : catches adjectives
  Example : "Beautiful" = "^adj<SN><adj><sint><mf>{^সুন্দর<2><3>$}$"
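
The numeric tags inside a chunk (here `<2><3>`) appear to point back at the chunk's own tags by position. The sketch below illustrates that reading in Python; `resolve_links` is a hypothetical helper, and the 1-based numbering (with `<SN>` as tag 1) is an assumption inferred from the examples on this page.

```python
import re

TAG_RE = re.compile(r"<([^>]+)>")


def resolve_links(chunk_tags, unit_tags):
    """Replace numeric tags with the chunk tag at that position.

    Assumption: positions are 1-based, so <2> is the second chunk tag.
    """
    resolved = []
    for tag in unit_tags:
        if tag.isdigit():
            resolved.append(chunk_tags[int(tag) - 1])
        else:
            resolved.append(tag)
    return resolved


# Tags taken from the "Beautiful" example above.
chunk_tags = TAG_RE.findall("<SN><adj><sint><mf>")
unit_tags = TAG_RE.findall("<2><3>")
print(resolve_links(chunk_tags, unit_tags))  # ['adj', 'sint']
```

Under this assumption, changing a chunk tag in a later transfer stage changes every lexical unit that links to it, which is why the nominal examples carry `<2><3><4>` instead of literal gender, number and case tags.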


4. ART NOM (chunk: art_nom)

  Pattern : definite determiner (i.e. 'the') nominal
  Action  : assign <p3><infml> to person; when the nominal is a singular noun, mark it as 'definite'
  Chunk   : art_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with article ('The')
  Example : "The Sundarbans" = "^art_nom<SN><m><sg><nom>{^সুন্দরবন<np><top><2><3><4>$}$"


5. DET NOM (chunk: det_nom)

  Pattern : determiner nominal
  Action  : fix determiner tag, assign <p3><infml> to person
  Chunk   : det_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with determiners
  Example : "Our Sundarbans" = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"


6. DET ADJ NOM (chunk: det_adj_nom)

7. PRNSUBJ (chunk: prnsubj)

8. PRNREF (chunk: prnref)

9. VBSER VBLEX (chunk: vbser_vblex)

10. VBHAVER VBLEX (chunk: vbhaver_vblex)

11. VAUX VBLEX (chunk: vaux_vblex)

12. FTAUX BE VBLEX (chunk: ftaux_be_vblex)

13. VBSER PRES (chunk: vbser_pri)

14. VBSER PAST (chunk: vbser_past)

15. VERB CONJ (chunk: verbcj)

16. VBSER PRES (chunk: vbser_pres)

17. VBHAVER VBSER VBLEX (chunk: vbhaver_vbser_vblex)

18. NUM NOM (chunk: num_nom)

19. POST (chunk: post)

20. FTAUX VBHAVER VBLEX (chunk: ftaux_vbhaver_vblex)

21. FTAUX VBHAVER VBSER VBLEX (chunk: ftaux_vbhaver_vbser_vblex)

22. VBDO (chunk: vbdo)

23. VBDO VBLEX (chunk: vbdo_vblex)

24. GERUND (chunk: vbger)

25. FTAUX BE (chunk: ftaux_be)

26. VBHAVER VBSER (chunk: vbhaver_vbser)

27. VBHAVER VBSER VBLEX ADV (chunk: vbhaver_vbser_vblex_adv)

28. FTAUX BE VBLEX ADV (chunk: ftaux_be_vblex_adv)

29. VBSER VBLEX ADV (chunk: vbser_vblex_adv)

30. VBHAVER VBLEX ADV (chunk: vbhaver_vblex_adv)

31. FTAUX VBLEX ADV (chunk: ftaux_vblex_adv)

32. VERB ADV (chunk: verbcj_adv)

33. VBDO VBLEX ADV (chunk: vbdo_vblex_adv)

34. GERUND ADV (chunk: gerund_adv)

35. ADV ADJ NOM (chunk: adv_adj_nom)

36. ADV ADJ (chunk: adv_adj)

37. DET ADV ADJ NOM (chunk: det_adv_adj_nom)

38. ART ADJ NOM (chunk: art_adj_nom)

Issues (en-bn)

  • In the rule "SN SV SN" in t2x, the case of the last SN is always forced to <obj>. This is not always correct. For example,
  "I eat fish" → "আমি মাছকে খাই"

should be "আমি মাছ খাই", whereas

  "I love you" → "আমি আপনাকে ভালবাসি"

is correct with case <obj>.


  • The future perfect tense should be translated like this:
  "I shall have eaten rice" → "আমি ভাত খেয়ে থাকব"

However, we do not have an inflection of the verb 'খ/া' for 'েয়ে থাকব'.


  • There may be a tagger problem. Running the following to get the pretransfer output for "Zaher plays football":
  echo "Zaher plays football" | apertium -d . en-bn-tagger

outputs this:

  ^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$

where 'plays' should be analyzed as a verb (play<vblex><pri><p3><sg>). Because of this, we get output such as:

  "Zaher plays football" → "জাহের নাটকগুলো ফুটবল"
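
A symptom like the one above can be spotted automatically by checking whether the tagger output contains any verb at all. The following Python sketch is only an illustration: `parse_units` and `has_verb` are hypothetical helpers, and the list of verb tags is an assumption based on the tag names that appear in the rule names on this page.

```python
import re

UNIT_RE = re.compile(r"\^([^<$]+)((?:<[^>]+>)*)\$")
TAG_RE = re.compile(r"<([^>]+)>")

# Assumed set of verb part-of-speech tags; adjust to the actual tagset.
VERB_TAGS = ("vblex", "vbser", "vbhaver", "vbdo", "vaux")


def parse_units(stream):
    """Return (lemma, [tags]) for every ^...$ lexical unit in the stream."""
    return [(lemma, TAG_RE.findall(tags))
            for lemma, tags in UNIT_RE.findall(stream)]


def has_verb(units):
    """True if the first tag of any lexical unit is a verb tag."""
    return any(tags and tags[0] in VERB_TAGS for _, tags in units)


tagged = "^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$"
expected = "^Zaher<np><ant><m><sg>$ ^play<vblex><pri><p3><sg>$ ^football<n><sg>$"
print(has_verb(parse_units(tagged)))    # False
print(has_verb(parse_units(expected)))  # True
```

A sentence whose tagged form contains no verb is a reasonable candidate for manual inspection of the tagger's choices.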