Difference between revisions of "Bengali and English/Transfer Chunker"
Issues(en-bn)
- For the future perfect tense, the translation should look like this:
"I shall have eaten rice" → "আমি ভাত খেয়ে থাকব"
However, we don't have inflections of the verb 'খ/া' for 'েয়ে থাকব'.
- There may be a tagger problem. Running the following to get the pretransfer output for "Zaher plays football":
echo "Zaher plays football" | /usr/local/bin/lt-proc /usr/local/share/apertium/apertium-bn-en/en-bn.automorf.bin | \
/usr/local/bin/apertium-tagger -g /usr/local/share/apertium/apertium-bn-en/en-bn.prob | /usr/local/bin/apertium-pretransfer
outputs this:
^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$
where 'plays' should be analysed as a verb (play<vblex><pri><p3><sg>). Because it is tagged as a plural noun instead, we get output such as:
"Zaher plays football" → "জাহের নাটকগুলো ফুটবল"
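The mis-tagging described above can be spotted mechanically. The following is a minimal sketch, not part of the language pair: it assumes the pretransfer output is available as a string, and the helper names (`parse_units`, `has_lexical_verb`) as well as the full `expected` stream are hypothetical, built only from the analysis the text says 'plays' should receive.

```python
import re

# One lexical unit in the Apertium stream format: ^lemma<tag1><tag2>...$
LEXICAL_UNIT = re.compile(r"\^([^<$^]+)((?:<[^>]+>)*)\$")


def parse_units(stream):
    """Return a list of (lemma, [tags]) pairs found in a pretransfer stream."""
    units = []
    for lemma, tag_string in LEXICAL_UNIT.findall(stream):
        units.append((lemma, re.findall(r"<([^>]+)>", tag_string)))
    return units


def has_lexical_verb(units):
    """True if at least one unit carries the vblex tag."""
    return any("vblex" in tags for _, tags in units)


# Output reported on this page for "Zaher plays football".
observed = "^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$"
# Assumed stream if 'plays' were tagged as the page says it should be.
expected = "^Zaher<np><ant><m><sg>$ ^play<vblex><pri><p3><sg>$ ^football<n><sg>$"

print(has_lexical_verb(parse_units(observed)))  # False
print(has_lexical_verb(parse_units(expected)))  # True
```

A sentence whose stream contains no `vblex` unit is a cheap signal that the tagger may have picked a noun reading for the verb, as it did here.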
|||
Latest revision as of 20:33, 20 August 2011
Rules
1. NOM (chunk: nom)
Pattern : noun/proper noun
Action : assign <p3><infml> to person
Chunk : nom<SN><[gen]><[nbr]><nom>
Comment : catches the nominals
Example : "Bangladesh" = "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"
2. ADJ NOM (chunk: adj_nom)
Pattern : adjective proper-noun
Action :
Chunk : adj_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with an adjective
Example : "Beautiful Bangladesh" = "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"
3. ADJ (chunk: adj)
Pattern : adjective
Action :
Chunk : adj<SN><[adj attribute]><[gender]><nom>
Comment : catches adjectives
Example : "Beautiful" = "^adj<SN><adj><sint><mf>{^সুন্দর<2><3>$}$"
4. ART NOM (chunk: art_nom)
Pattern : definite determiner (i.e. 'The')
Action : assign <p3><infml> to person; when the nominal is a singular noun, mark it as 'definitive'
Chunk : art_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with the article ('The')
Example : "The Sundarbans" = "^art_nom<SN><m><sg><nom>{^সুন্দরবন<np><top><2><3><4>$}$"
5. DET NOM (chunk: det_nom)
Pattern : determiner nominal
Action : fix determiner tag, assign <p3><infml> to person
Chunk : det_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with determiners
Example : "Our Sundarbans" = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"
6. DET ADJ NOM (chunk: det_adj_nom)
7. PRNSUBJ (chunk: prnsubj)
8. PRNREF (chunk: prnref)
9. VBSER VBLEX (chunk: vbser_vblex)
10. VBHAVER VBLEX (chunk: vbhaver_vblex)
11. VAUX VBLEX (chunk: vaux_vblex)
12. FTAUX BE VBLEX (chunk: ftaux_be_vblex)
13. VBSER PRES (chunk: vbser_pri)
14. VBSER PAST (chunk: vbser_past)
15. VERB CONJ (chunk: verbcj)
16. VBSER PRES (chunk: vbser_pres)
17. VBHAVER VBSER VBLEX (chunk: vbhaver_vbser_vblex)
18. NUM NOM (chunk: num_nom)
19. POST (chunk: post)
20. FTAUX VBHAVER VBLEX (chunk: ftaux_vbhaver_vblex)
21. FTAUX VBHAVER VBSER VBLEX (chunk: ftaux_vbhaver_vbser_vblex)
22. VBDO (chunk: vbdo)
23. VBDO VBLEX (chunk: vbdo_vblex)
24. GERUND (chunk: vbger)
25. FTAUX BE (chunk: ftaux_be)
26. VBHAVER VBSER (chunk: vbhaver_vbser)
27. VBHAVER VBSER VBLEX ADV (chunk: vbhaver_vbser_vblex_adv)
28. FTAUX BE VBLEX ADV (chunk: ftaux_be_vblex_adv)
29. VBSER VBLEX ADV (chunk: vbser_vblex_adv)
30. VBHAVER VBLEX ADV (chunk: vbhaver_vblex_adv)
31. FTAUX VBLEX ADV (chunk: ftaux_vblex_adv)
32. VERB ADV (chunk: verbcj_adv)
33. VBDO VBLEX ADV (chunk: vbdo_vblex_adv)
34. GERUND ADV (chunk: gerund_adv)
35. ADV ADJ NOM (chunk: adv_adj_nom)
36. ADV ADJ (chunk: adv_adj)
37. DET ADV ADJ NOM (chunk: det_adv_adj_nom)
38. ART ADJ NOM (chunk: art_adj_nom)
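The chunk examples given for rules 1 to 5 all share one shape: a chunk name, a run of chunk-level tags, and the lexical units in braces. As an illustration only, the sketch below splits such a string into those three parts. `parse_chunk` is a hypothetical helper, not part of Apertium, and the numeric tags such as <2> are kept as plain strings without being resolved.

```python
import re

# ^name<tag>...{ ...lexical units... }$
CHUNK = re.compile(r"^\^([^<{]+)((?:<[^>]+>)*)\{(.*)\}\$$")
# ^lemma<tag>...$ inside the braces
UNIT = re.compile(r"\^([^<$^]+)((?:<[^>]+>)*)\$")
TAG = re.compile(r"<([^>]+)>")


def parse_chunk(text):
    """Split a chunker output string into name, chunk tags and words."""
    match = CHUNK.match(text)
    if match is None:
        raise ValueError("not a chunk: " + text)
    name, tag_string, body = match.groups()
    words = [(lemma, TAG.findall(tags)) for lemma, tags in UNIT.findall(body)]
    return {"name": name, "tags": TAG.findall(tag_string), "words": words}


# Example string taken from rule 2 (ADJ NOM).
chunk = parse_chunk(
    "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"
)
print(chunk["name"])   # adj_nom
print(chunk["tags"])   # ['SN', 'mf', 'sg', 'nom']
print(len(chunk["words"]))  # 2
```

Reading the examples this way makes it easier to check that a rule's Chunk line (for instance adj_nom<SN><[gender]><[number]><nom>) matches the tags that actually appear in its Example line.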
Issues(en-bn)
- In the rule "SN SV SN" in t2x, the case of the last SN is always forced to <obj>. This is not always correct. For example,
"I eat fish" → "আমি মাছকে খাই"
should be "আমি মাছ খাই", whereas
"I love you" → "আমি আপনাকে ভালবাসি"
is fine with case <obj>.
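One possible direction, sketched here purely as an assumption, is to choose the case of the last SN from the tags of its head word instead of always forcing <obj>. The heuristic below (pronouns and personal names take <obj>, everything else stays <nom>) is only a guess that fits the two examples above; `last_sn_case` and the tag lists passed to it are hypothetical and do not describe what the t2x rule does today.

```python
def last_sn_case(head_tags):
    """Pick a case tag for the object SN from the tags of its head word.

    Assumed heuristic: pronouns (<prn>) and personal names (<np><ant>)
    take the objective case; other nominals are left in the nominative.
    """
    tags = set(head_tags)
    if "prn" in tags or {"np", "ant"} <= tags:
        return "obj"
    return "nom"


# Illustrative tag lists, not taken from the dictionaries.
print(last_sn_case(["prn", "obj", "p2", "mf", "sp"]))  # obj  ("you")
print(last_sn_case(["n", "sg"]))                       # nom  ("fish")
```

With a choice like this, "I eat fish" would keep মাছ unmarked while "I love you" would still produce আপনাকে. Whether the split should follow these tags or some other property would need checking against more sentence pairs.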