Difference between revisions of "Bengali and English/Transfer"
Revision as of 19:14, 10 August 2011
Rules(en-bn.t1x)
1. NOM (chunk: nom)
Pattern : noun/proper noun
Action : assign <p3><infml> to person
Chunk : nom<SN><[gen]><[nbr]><nom>
Comment : catches the nominals
Example : "Bangladesh" = "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"
2. ADJ NOM (chunk: adj_nom)
Pattern : adjective proper-noun
Action :
Chunk : adj_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with an adjective
Example : "Beautiful Bangladesh" = "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"
3. ADJ (chunk: adj)
Pattern : adjective
Action :
Chunk : adj<SN><[adj attribute]><[gender]><nom>
Comment : catches adjectives
Example : "Beautiful" = "^adj<SN><adj><sint><mf>{^সুন্দর<2><3>$}$"
4. ART NOM (chunk: art_nom)
Pattern : definite determiner (i.e. 'The')
Action : assign <p3><infml> to person; when the nominal is a singular noun, mark it as definite
Chunk : art_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with the article ('The')
Example : "The Sundarbans" = "^art_nom<SN><m><sg><nom>{^সুন্দরবন<np><top><2><3><4>$}$"
5. DET NOM (chunk: det_nom)
Pattern : determiner nominal
Action : fix the determiner tag, assign <p3><infml> to person
Chunk : det_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with determiners
Example : "Our Sundarbans" = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"
6. DET ADJ NOM (chunk: det_adj_nom)
7. PRNSUBJ (chunk: prnsubj)
8. PRNREF (chunk: prnref)
9. VBSER VBLEX (chunk: vbser_vblex)
10. VBHAVER VBLEX (chunk: vbhaver_vblex)
11. VAUX VBLEX (chunk: vaux_vblex)
12. FTAUX BE VBLEX (chunk: ftaux_be_vblex)
13. VBSER PRES (chunk: vbser_pri)
14. VBSER PAST (chunk: vbser_past)
15. VERB CONJ (chunk: verbcj)
16. VBSER PRES (chunk: vbser_pres)
17. VBHAVER VBSER VBLEX (chunk: vbhaver_vbser_vblex)
18. NUM NOM (chunk: num_nom)
19. POST (chunk: post)
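The example strings in the rules above share one shape: a chunk name, the chunk tags, and then the lexical units inside braces. As a rough illustration of how to read them, the following Python sketch splits such a string into its parts. The function name `parse_chunk` and the regular expressions are assumptions made for this illustration; they are not part of Apertium and do not handle escaped characters.

```python
import re

# One chunk: ^name<tag>...{ inner lexical units }$
CHUNK_RE = re.compile(r"\^(?P<name>[^<{]+)(?P<tags>(?:<[^>]+>)*)\{(?P<body>.*)\}\$")
# One lexical unit inside the braces: ^lemma<tag>...$
UNIT_RE = re.compile(r"\^(?P<lemma>[^<$]+)(?P<tags>(?:<[^>]+>)*)\$")
# A single tag such as <SN> or <2>
TAG_RE = re.compile(r"<([^>]+)>")


def parse_chunk(text):
    """Split a chunk string into its name, chunk tags and lexical units.

    Hypothetical helper for reading the examples on this page; it assumes
    well-formed input without escaped characters.
    """
    match = CHUNK_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a chunk: %r" % text)
    units = [
        (unit.group("lemma"), TAG_RE.findall(unit.group("tags")))
        for unit in UNIT_RE.finditer(match.group("body"))
    ]
    return {
        "name": match.group("name"),
        "tags": TAG_RE.findall(match.group("tags")),
        "units": units,
    }


if __name__ == "__main__":
    chunk = parse_chunk("^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$")
    print(chunk["name"], chunk["tags"], chunk["units"])
```

For the "Bangladesh" example of rule 1, this gives the chunk name `nom`, the chunk tags `SN`, `mf`, `sg`, `nom`, and one lexical unit whose tags include the link references `2`, `3` and `4`.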
Issues(en-bn)
- In the rule "SN SV SN" in t2x, the case of the last SN is always forced to <obj>. This is not always correct. For example,
"I eat fish" → "আমি মাছকে খাই"
should be "আমি মাছ খাই", whereas
"I love you" → "আমি আপনাকে ভালবাসি"
is fine with the case <obj>.
- The future perfect tense should be translated like this:
"I shall have eaten rice" → "আমি ভাত খেয়ে থাকব"
But we do not have the inflection 'েয়ে থাকব' for the verb 'খ/া'.
- This may be a tagger problem. Running the following to get the pretransfer output for "Zaher plays football":
echo "Zaher plays football" | /usr/local/bin/lt-proc /usr/local/share/apertium/apertium-bn-en/en-bn.automorf.bin | \
  /usr/local/bin/apertium-tagger -g /usr/local/share/apertium/apertium-bn-en/en-bn.prob | /usr/local/bin/apertium-pretransfer
outputs this:
^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$
where 'plays' should be analysed as a verb (play<vblex><pri><p3><sg>). Because of this, we get output such as:
"Zaher plays football" → "জাহের নাটকগুলো ফুটবল"
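A quick way to spot this symptom in the pretransfer output is to check whether any lexical unit carries a verb tag at all. The Python sketch below is only an illustration: the helper names `parse_units` and `has_verb` are hypothetical, and the set of verb tags (`vblex`, `vbser`, `vbhaver`, `vaux`) is an assumption based on the rule names listed on this page.

```python
import re

# One lexical unit in the stream: ^lemma<tag><tag>...$
UNIT_RE = re.compile(r"\^([^<$]+)((?:<[^>]+>)*)\$")
TAG_RE = re.compile(r"<([^>]+)>")

# Assumed verb part-of-speech tags (taken from the rule names on this page).
VERB_TAGS = {"vblex", "vbser", "vbhaver", "vaux"}


def parse_units(stream):
    """Return a list of (lemma, [tags]) pairs from tagger/pretransfer output."""
    return [(lemma, TAG_RE.findall(tags)) for lemma, tags in UNIT_RE.findall(stream)]


def has_verb(units):
    """True if the first tag of any unit is one of the assumed verb tags."""
    return any(tags and tags[0] in VERB_TAGS for _, tags in units)


if __name__ == "__main__":
    observed = "^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$"
    expected = "^Zaher<np><ant><m><sg>$ ^play<vblex><pri><p3><sg>$ ^football<n><sg>$"
    print(has_verb(parse_units(observed)))   # False: 'plays' was tagged as a noun
    print(has_verb(parse_units(expected)))   # True: 'plays' tagged as a verb
```

A check like this only detects that no verb was chosen; it does not say whether the fault lies in the tagger model or in the dictionary.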