Bengali and English/Transfer Chunker
Rules
1. NOM (chunk: nom)
Pattern : noun / proper noun
Action : assign <p3><infml> to person
Chunk : nom<SN><[gen]><[nbr]><nom>
Comment : catches the nominals
Example : "Bangladesh" = "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"
2. ADJ NOM (chunk: adj_nom)
Pattern : adjective proper-noun
Action :
Chunk : adj_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with an adjective
Example : "Beautiful Bangladesh" = "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"
3. ADJ (chunk: adj)
Pattern : adjective
Action :
Chunk : adj<SN><[adj attribute]><[gender]><nom>
Comment : catches adjectives
Example : "Beautiful" = "^adj<SN><adj><sint><mf>{^সুন্দর<2><3>$}$"
4. ART NOM (chunk: art_nom)
Pattern : definite determiner (i.e. 'The')
Action : assign <p3><infml> to person; when the nominal is a singular noun, mark it as definite
Chunk : art_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with the article ('The')
Example : "The Sundarbans" = "^art_nom<SN><m><sg><nom>{^সুন্দরবন<np><top><2><3><4>$}$"
5. DET NOM (chunk: det_nom)
Pattern : determiner nominal
Action : fix determiner tag, assign <p3><infml> to person
Chunk : det_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with determiners
Example : "Our Sundarbans" = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"
6. DET ADJ NOM (chunk: det_adj_nom)
7. PRNSUBJ (chunk: prnsubj)
8. PRNREF (chunk: prnref)
9. VBSER VBLEX (chunk: vbser_vblex)
10. VBHAVER VBLEX (chunk: vbhaver_vblex)
11. VAUX VBLEX (chunk: vaux_vblex)
12. FTAUX BE VBLEX (chunk: ftaux_be_vblex)
13. VBSER PRES (chunk: vbser_pri)
14. VBSER PAST (chunk: vbser_past)
15. VERB CONJ (chunk: verbcj)
16. VBSER PRES (chunk: vbser_pres)
17. VBHAVER VBSER VBLEX (chunk: vbhaver_vbser_vblex)
18. NUM NOM (chunk: num_nom)
19. POST (chunk: post)
20. FTAUX VBHAVER VBLEX (chunk: ftaux_vbhaver_vblex)
21. FTAUX VBHAVER VBSER VBLEX (chunk: ftaux_vbhaver_vbser_vblex)
22. VBDO (chunk: vbdo)
23. VBDO VBLEX (chunk: vbdo_vblex)
24. GERUND (chunk: vbger)
25. FTAUX BE (chunk: ftaux_be)
26. VBHAVER VBSER (chunk: vbhaver_vbser)
27. VBHAVER VBSER VBLEX ADV (chunk: vbhaver_vbser_vblex_adv)
28. FTAUX BE VBLEX ADV (chunk: ftaux_be_vblex_adv)
29. VBSER VBLEX ADV (chunk: vbser_vblex_adv)
30. VBHAVER VBLEX ADV (chunk: vbhaver_vblex_adv)
31. FTAUX VBLEX ADV (chunk: ftaux_vblex_adv)
32. VERB ADV (chunk: verbcj_adv)
33. VBDO VBLEX ADV (chunk: vbdo_vblex_adv)
34. GERUND ADV (chunk: gerund_adv)
35. ADV ADJ NOM (chunk: adv_adj_nom)
36. ADV ADJ (chunk: adv_adj)
37. DET ADV ADJ NOM (chunk: det_adv_adj_nom)
38. ART ADJ NOM (chunk: art_adj_nom)
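The chunk strings shown in the rule examples follow a regular shape: a chunk name, a run of chunk-level tags, and then one or more lexical units inside braces. As a rough aid for inspecting chunker output, the sketch below splits such a string into its parts. It is not part of the transfer files; the function name `parse_chunk` and the regular expressions are assumptions made for illustration, and they only cover the simple, unescaped form seen in the examples on this page.

```python
import re

# Hypothetical helper, written only against the example strings on this page.
# Whole chunk: ^name<tag>...{ ...lexical units... }$
CHUNK_RE = re.compile(r"^\^(?P<name>[^<{]+)(?P<tags>(?:<[^>]+>)*)\{(?P<body>.*)\}\$$")
# One lexical unit inside the braces: ^lemma<tag>...$
LU_RE = re.compile(r"\^([^<$]*)((?:<[^>]+>)*)\$")
TAG_RE = re.compile(r"<([^>]+)>")


def parse_chunk(chunk):
    """Split a chunk string into its name, chunk tags and lexical units."""
    match = CHUNK_RE.match(chunk.strip())
    if match is None:
        raise ValueError("not a chunk string: %r" % chunk)
    units = [
        (lemma, TAG_RE.findall(tags))
        for lemma, tags in LU_RE.findall(match.group("body"))
    ]
    return {
        "name": match.group("name"),
        "tags": TAG_RE.findall(match.group("tags")),
        "units": units,
    }


if __name__ == "__main__":
    example = ("^adj_nom<SN><mf><sg><nom>"
               "{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$")
    parsed = parse_chunk(example)
    print(parsed["name"], parsed["tags"])
    for lemma, tags in parsed["units"]:
        print(" ", lemma, tags)
```

Numeric tags such as `<2>` are kept as plain strings here; the sketch does not resolve them against the chunk-level tags.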
Issues (en-bn)
- In the "SN SV SN" rule in t2x, the case of the last SN is always forced to <obj>. This is not always correct. For example,
"I eat fish" → "আমি মাছকে খাই"
should be "আমি মাছ খাই", whereas
"I love you" → "আমি আপনাকে ভালবাসি"
is correct with case <obj>.
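One way to think about the two examples above is that the object marker fits the pronoun "you" but not the common noun "fish". The sketch below expresses that as a small decision function. It is only an illustration of a possible heuristic, not the logic of the actual t2x rule: the function name `object_case`, the choice of `prn` as the triggering tag, and the `animate` tag are all assumptions (in particular, `animate` is hypothetical and may not exist in the dictionaries).

```python
# Hypothetical heuristic: tag names other than those seen on this page
# (e.g. "animate") are assumptions made for illustration only.
MARKED_OBJECT_TAGS = {"prn", "animate"}


def object_case(object_tags):
    """Pick the case for the last SN of an "SN SV SN" pattern.

    Returns "obj" when the object head looks like a pronoun or an
    animate noun, and "nom" (unmarked) otherwise.
    """
    if MARKED_OBJECT_TAGS & set(object_tags):
        return "obj"
    return "nom"


if __name__ == "__main__":
    print(object_case(["prn", "p2", "sg"]))  # object such as "you"
    print(object_case(["n", "sg"]))          # object such as "fish"
```

Whether pronoun or animacy information is the right trigger would need to be checked against more sentence pairs than the two given here.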