Difference between revisions of "Bengali and English/Transfer Chunker"
Revision as of 15:25, 15 August 2011
Rules
1. NOM (chunk: nom)
Pattern : noun/proper noun
Action : assign <p3><infml> to person
Chunk : nom<SN><[gen]><[nbr]><nom>
Comment : catches the nominals
Example : "Bangladesh" = "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"
2. ADJ NOM (chunk: adj_nom)
Pattern : adjective proper-noun
Action :
Chunk : adj_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with adjective
Example : "Beautiful Bangladesh" = "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"
3. ADJ (chunk: adj)
Pattern : adjective
Action :
Chunk : adj<SN><[adj attribute]><[gender]><nom>
Comment : catches adjectives
Example : "Beautiful" = "^adj<SN><adj><sint><mf>{^সুন্দর<2><3>$}$"
4. ART NOM (chunk: art_nom)
Pattern : definite determiner (i.e. 'The')
Action : assign <p3><infml> to person; when the nominal is a singular noun, mark it as definite
Chunk : art_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with article ('The')
Example : "The Sundarbans" = "^art_nom<SN><m><sg><nom>{^সুন্দরবন<np><top><2><3><4>$}$"
5. DET NOM (chunk: det_nom)
Pattern : determiner nominal
Action : fix determiner tag, assign <p3><infml> to person
Chunk : det_nom<SN><[gender]><[number]><nom>
Comment : catches nominals with determiners
Example : "Our Sundarbans" = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"
6. DET ADJ NOM (chunk: det_adj_nom)
7. PRNSUBJ (chunk: prnsubj)
8. PRNREF (chunk: prnref)
9. VBSER VBLEX (chunk: vbser_vblex)
10. VBHAVER VBLEX (chunk: vbhaver_vblex)
11. VAUX VBLEX (chunk: vaux_vblex)
12. FTAUX BE VBLEX (chunk: ftaux_be_vblex)
13. VBSER PRES (chunk: vbser_pri)
14. VBSER PAST (chunk: vbser_past)
15. VERB CONJ (chunk: verbcj)
16. VBSER PRES (chunk: vbser_pres)
17. VBHAVER VBSER VBLEX (chunk: vbhaver_vbser_vblex)
18. NUM NOM (chunk: num_nom)
19. POST (chunk: post)
20. FTAUX VBHAVER VBLEX (chunk: ftaux_vbhaver_vblex)
21. FTAUX VBHAVER VBSER VBLEX (chunk: ftaux_vbhaver_vbser_vblex)
22. VBDO (chunk: vbdo)
23. VBDO VBLEX (chunk: vbdo_vblex)
24. GERUND (chunk: vbger)
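The "Chunk" and "Example" fields above all share one shape: a chunk name, a run of chunk-level tags, and a braced body of lexical units. The Python sketch below is only an illustration of how such a string can be split apart for inspection; it is not part of the Apertium toolchain, and the names parse_chunk, CHUNK_RE, TAG_RE and UNIT_RE are hypothetical helpers introduced here.

```python
import re

# Hypothetical helpers for inspecting chunker output; not an Apertium API.
CHUNK_RE = re.compile(r"^\^(?P<head>[^{]*)\{(?P<body>.*)\}\$$")
TAG_RE = re.compile(r"<([^<>]+)>")
UNIT_RE = re.compile(r"\^(.*?)\$")


def parse_chunk(text):
    """Split one chunk string into (name, chunk_tags, units).

    units is a list of (lemma, tags) pairs, one per lexical unit
    found inside the braces.
    """
    match = CHUNK_RE.match(text.strip())
    if match is None:
        raise ValueError("not a chunk: %r" % text)
    head = match.group("head")
    name = head.split("<", 1)[0]          # text before the first tag
    chunk_tags = TAG_RE.findall(head)     # tags attached to the chunk itself
    units = []
    for raw in UNIT_RE.findall(match.group("body")):
        lemma = raw.split("<", 1)[0]
        units.append((lemma, TAG_RE.findall(raw)))
    return name, chunk_tags, units


if __name__ == "__main__":
    example = ("^adj_nom<SN><mf><sg><nom>"
               "{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$")
    name, chunk_tags, units = parse_chunk(example)
    print(name, chunk_tags)
    for lemma, tags in units:
        print("  ", lemma, tags)
```

Numeric tags such as <2> or <3> are returned as plain strings; the sketch does not resolve them against the chunk-level tags.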
Issues(en-bn)
- In the "SN SV SN" rule in t2x, the case of the last SN is always forced to <obj>. This is not always correct. For example,
"I eat fish" → "আমি মাছকে খাই"
should be "আমি মাছ খাই", whereas
"I love you" → "আমি আপনাকে ভালবাসি"
is correct with case <obj>.
- The future perfect tense should be translated like this:
"I shall have eaten rice" → "আমি ভাত খেয়ে থাকব"
However, we do not yet have inflections of the verb 'খ/া' for 'েয়ে থাকব'.
- There may be a problem in the tagger. Running the following on "Zaher plays football" to inspect the output at the pretransfer stage:
echo "Zaher plays football" | apertium -d . en-bn-tagger
outputs this:
^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$
Here 'plays' should be analysed as a verb (play<vblex><pri><p3><sg>), not as a plural noun. As a result, we get translations such as:
"Zaher plays football" → "জাহের নাটকগুলো ফুটবল"
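A quick way to spot this kind of mis-tagging is to check whether the tagger output contains any verb reading at all. The Python sketch below is an illustrative helper, not an Apertium tool: lexical_units, has_verb and VERB_TAGS are hypothetical names, and the set of verb tags is an assumption based on the rule names listed above.

```python
import re

# Hypothetical helpers for inspecting tagger output; not an Apertium API.
UNIT_RE = re.compile(r"\^([^$]*)\$")
TAG_RE = re.compile(r"<([^<>]+)>")

# Assumption: these first-position tags mark verbal readings.
VERB_TAGS = {"vblex", "vbser", "vbhaver", "vbdo", "vaux"}


def lexical_units(stream):
    """Return (lemma, tags) pairs for each ^...$ unit in the stream."""
    units = []
    for raw in UNIT_RE.findall(stream):
        lemma = raw.split("<", 1)[0]
        units.append((lemma, TAG_RE.findall(raw)))
    return units


def has_verb(stream):
    """True if any unit's first tag is one of the assumed verb tags."""
    return any(tags and tags[0] in VERB_TAGS
               for _, tags in lexical_units(stream))


if __name__ == "__main__":
    observed = "^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$"
    expected = ("^Zaher<np><ant><m><sg>$ ^play<vblex><pri><p3><sg>$ "
                "^football<n><sg>$")
    print("observed has verb:", has_verb(observed))
    print("expected has verb:", has_verb(expected))
```

The string could also be piped in from the apertium command shown above instead of being hard-coded.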