Bengali and English/Transfer Chunker

From Apertium


1. NOM (chunk: nom)

  Pattern : noun/proper noun
  Action  : assign <p3><infml> to person
  Chunk   : nom<SN><[gender]><[number]><nom>
  Comment : catches the nominals
  Example : "Bangladesh" = "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"
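
  The chunk notation used in the examples on this page can be taken apart mechanically. The following is a minimal Python sketch, not part of Apertium itself: the function name parse_chunk and the regular expressions are illustrative assumptions, and the sketch only handles a single chunk with no escaped characters.

```python
import re

# One chunk: ^name<tag>...{ inner lexical units }$
CHUNK_RE = re.compile(r"\^([^<{]+)((?:<[^>]+>)*)\{(.*)\}\$")
# One lexical unit inside the braces: ^lemma<tag>...$
UNIT_RE = re.compile(r"\^([^<$]+)((?:<[^>]+>)*)\$")
TAG_RE = re.compile(r"<([^>]+)>")


def parse_chunk(text):
    """Split a chunk string into its name, chunk tags and inner lexical units."""
    match = CHUNK_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a chunk: {text!r}")
    name, tags, body = match.groups()
    units = [
        (lemma, TAG_RE.findall(unit_tags))
        for lemma, unit_tags in UNIT_RE.findall(body)
    ]
    return name, TAG_RE.findall(tags), units


name, tags, units = parse_chunk(
    "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"
)
print(name)   # chunk name
print(tags)   # chunk tags, in order
print(units)  # (lemma, tags) pairs for the lexical units inside the braces
```

  For the "Bangladesh" example this yields the chunk name nom, the chunk tags SN, mf, sg, nom, and one lexical unit whose tags are np, top, 2, 3, 4.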

2. ADJ NOM (chunk: adj_nom)

  Pattern : adjective proper-noun
  Action  : 
  Chunk   : adj_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with an adjective
  Example : "Beautiful Bangladesh" = "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"

3. ADJ (chunk: adj)

  Pattern : adjective
  Action  : 
  Chunk   : adj<SN><[adj attribute]><[gender]><nom>
  Comment : catches adjectives
  Example : "Beautiful" = "^adj<SN><adj><sint><mf>{^সুন্দর<2><3>$}$"

4. ART NOM (chunk: art_nom)

  Pattern : definite determiner (i.e. 'the') followed by a nominal
  Action  : assign <p3><infml> to person; when the nominal is a singular noun, mark it as 'definite'
  Chunk   : art_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with the article ('the')
  Example : "The Sundarbans" = "^art_nom<SN><m><sg><nom>{^সুন্দরবন<np><top><2><3><4>$}$"

5. DET NOM (chunk: det_nom)

  Pattern : determiner nominal
  Action  : fix determiner tag, assign <p3><infml> to person
  Chunk   : det_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with determiners
  Example : "Our Sundarbans" = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"
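
  The numeric tags such as <2><3><4> inside the braces are, under the usual Apertium chunk convention, references to the chunk's own tags by position (counting from 1). The following Python sketch shows how such references could be filled in. It is an illustration only: the name resolve_links and the regular expressions are assumptions, not the code Apertium runs.

```python
import re

# Chunk head tags and the body between the braces.
CHUNK_RE = re.compile(r"\^[^<{]+((?:<[^>]+>)*)\{(.*)\}\$")
TAG_RE = re.compile(r"<([^>]+)>")
# A numeric tag such as <2>, read here as "copy chunk tag number 2".
LINK_RE = re.compile(r"<([0-9]+)>")


def resolve_links(chunk):
    """Replace numeric tags inside the braces with the chunk tag they point to."""
    match = CHUNK_RE.fullmatch(chunk)
    if match is None:
        raise ValueError(f"not a chunk: {chunk!r}")
    chunk_tags = TAG_RE.findall(match.group(1))
    body = match.group(2)

    def substitute(link):
        index = int(link.group(1))
        if 1 <= index <= len(chunk_tags):
            return f"<{chunk_tags[index - 1]}>"
        # Leave references that point past the last chunk tag untouched.
        return link.group(0)

    return LINK_RE.sub(substitute, body)


print(resolve_links(
    "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"
))
```

  Under that reading, the noun in "Our Sundarbans" picks up <m><sg><nom> from the chunk, while the determiner, which carries no numeric tags, is left unchanged.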

6. DET ADJ NOM (chunk: det_adj_nom)

7. PRNSUBJ (chunk: prnsubj)

8. PRNREF (chunk: prnref)

9. VBSER VBLEX (chunk: vbser_vblex)

10. VBHAVER VBLEX (chunk: vbhaver_vblex)

11. VAUX VBLEX (chunk: vaux_vblex)

12. FTAUX BE VBLEX (chunk: ftaux_be_vblex)

13. VBSER PRES (chunk: vbser_pri)

14. VBSER PAST (chunk: vbser_past)

15. VERB CONJ (chunk: verbcj)

16. VBSER PRES (chunk: vbser_pres)

17. VBHAVER VBSER VBLEX (chunk: vbhaver_vbser_vblex)

18. NUM NOM (chunk: num_nom)

19. POST (chunk: post)

20. FTAUX VBHAVER VBLEX (chunk: ftaux_vbhaver_vblex)

21. FTAUX VBHAVER VBSER VBLEX (chunk: ftaux_vbhaver_vbser_vblex)

22. VBDO (chunk: vbdo)

23. VBDO VBLEX (chunk: vbdo_vblex)

24. GERUND (chunk: vbger)

25. FTAUX BE (chunk: ftaux_be)

26. VBHAVER VBSER (chunk: vbhaver_vbser)

27. VBHAVER VBSER VBLEX ADV (chunk: vbhaver_vbser_vblex_adv)

28. FTAUX BE VBLEX ADV (chunk: ftaux_be_vblex_adv)

29. VBSER VBLEX ADV (chunk: vbser_vblex_adv)

30. VBHAVER VBLEX ADV (chunk: vbhaver_vblex_adv)

31. FTAUX VBLEX ADV (chunk: ftaux_vblex_adv)

32. VERB ADV (chunk: verbcj_adv)

33. VBDO VBLEX ADV (chunk: vbdo_vblex_adv)

34. GERUND ADV (chunk: gerund_adv)

35. ADV ADJ NOM (chunk: adv_adj_nom)

36. ADV ADJ (chunk: adv_adj)

37. DET ADV ADJ NOM (chunk: det_adv_adj_nom)

38. ART ADJ NOM (chunk: art_adj_nom)


  • In the rule "SN SV SN" in t2x, the case of the last SN is always forced to <obj>. This is not always correct. For example,

  "I eat fish" → "আমি মাছকে খাই"

should be "আমি মাছ খাই", whereas

  "I love you" → "আমি আপনাকে ভালবাসি"

is correct with case <obj>.
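
The behaviour described in this note can be mimicked with a small Python sketch. It is a simplified model, not the actual t2x rule: chunks are reduced to their tag lists, the SV chunk is shown with only its type tag, the function name is hypothetical, and the fourth chunk tag is assumed to hold the case, as in nom<SN><[gender]><[number]><nom>.

```python
def force_obj_on_last_sn(chunks):
    """Return a copy of `chunks` with the case of the last SN set to "obj".

    `chunks` is a list of chunk tag lists. The change applies only when the
    sequence is exactly SN SV SN; any other sequence is returned unchanged.
    """
    types = [tags[0] for tags in chunks]
    result = [list(tags) for tags in chunks]
    if types == ["SN", "SV", "SN"] and len(result[2]) >= 4:
        # Assumption: the fourth chunk tag is the case slot.
        result[2][3] = "obj"
    return result


# Hypothetical tag lists for a subject, a verb and an object chunk.
sentence = [["SN", "mf", "sg", "nom"], ["SV"], ["SN", "mf", "sg", "nom"]]
print(force_obj_on_last_sn(sentence)[2])  # ['SN', 'mf', 'sg', 'obj']
```

Because the model looks only at the chunk types, the object of "I eat fish" and the object of "I love you" are treated identically, which is why both receive <obj>.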