Difference between revisions of "Bengali and English/Transfer Chunker"

From Apertium
=== Issues(en-bn) ===
* The future perfect tense should be translated like this:
"I shall have eaten rice" → "আমি ভাত খেয়ে থাকব"
But we don't have the inflection 'েয়ে থাকব' for the verb 'খ/া'.


* There may be a tagger problem. Running the following to get the pretransfer output for "Zaher plays football":
echo "Zaher plays football" | /usr/local/bin/lt-proc /usr/local/share/apertium/apertium-bn-en/en-bn.automorf.bin | \
/usr/local/bin/apertium-tagger -g /usr/local/share/apertium/apertium-bn-en/en-bn.prob | /usr/local/bin/apertium-pretransfer
outputs this:
^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$
where 'plays' should be analyzed as a verb (play<vblex><pri><p3><sg>), not as a plural noun. Because of this, we get outputs such as:
"Zaher plays football" → "জাহের নাটকগুলো ফুটবল"
in which 'plays' is rendered as the plural noun 'নাটকগুলো' ("dramas").
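
A quick way to spot this kind of tagger output is to scan the pretransfer stream for a verb tag. The sketch below is only an illustration in Python and is not part of the Apertium toolchain: the helper names `parse_lexical_units` and `has_verb` are hypothetical, and the set of verb tags is an assumption based on the tag names used in the rules on this page.

```python
import re

# One lexical unit looks like ^lemma<tag1><tag2>$ in the stream shown above.
LU_RE = re.compile(r"\^([^<$]+)((?:<[^>]+>)*)\$")

# Assumption: these tags mark verbs, mirroring the rule names on this page.
VERB_TAGS = {"vblex", "vbser", "vbhaver", "vbdo", "vaux"}


def parse_lexical_units(stream):
    """Split a pretransfer stream into (lemma, [tags]) pairs."""
    return [
        (lemma, re.findall(r"<([^>]+)>", tags))
        for lemma, tags in LU_RE.findall(stream)
    ]


def has_verb(units):
    """Return True if at least one lexical unit carries a verb tag."""
    return any(VERB_TAGS & set(tags) for _, tags in units)


observed = "^Zaher<np><ant><m><sg>$ ^play<n><pl>$ ^football<n><sg>$"
expected = "^Zaher<np><ant><m><sg>$ ^play<vblex><pri><p3><sg>$ ^football<n><sg>$"

print(has_verb(parse_lexical_units(observed)))
print(has_verb(parse_lexical_units(expected)))
```

The observed stream contains no verb tag at all, while the stream with the analysis suggested above does, so a check like this could flag sentences where the tagger picked the noun reading.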






Latest revision as of 20:33, 20 August 2011

Rules

1. NOM (chunk: nom)

  Pattern : noun/proper noun
  Action  : assign <p3><infml> to person
  Chunk   : nom<SN><[gen]><[nbr]><nom>
  Comment : catches the nominals
  Example : "Bangladesh" = "^nom<SN><mf><sg><nom>{^বাংলাদেশ<np><top><2><3><4>$}$"


2. ADJ NOM (chunk: adj_nom)

  Pattern : adjective proper-noun
  Action  : 
  Chunk   : adj_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with adjective
  Example : "Beautiful Bangladesh" = "^adj_nom<SN><mf><sg><nom>{^সুন্দর<adj><sint><mf>$ ^বাংলাদেশ<np><top><2><3><4>$}$"


3. ADJ (chunk: adj)

  Pattern : adjective
  Action  : 
  Chunk   : adj<SN><[adj attribute]><[gender]><nom>
  Comment : catches adjectives
  Example : "Beautiful" = "^adj<SN><adj><sint><mf>{^সুন্দর<2><3>$}$"


4. ART NOM (chunk: art_nom)

  Pattern : definite determiner (i.e. 'The') nominal
  Action  : assign <p3><infml> to person; when the nominal is a singular noun, mark it as 'definitive'
  Chunk   : art_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with article ('The')
  Example : "The Sundarbans" = "^art_nom<SN><m><sg><nom>{^সুন্দরবন<np><top><2><3><4>$}$"


5. DET NOM (chunk: det_nom)

  Pattern : determiner nominal
  Action  : fix determiner tag, assign <p3><infml> to person
  Chunk   : det_nom<SN><[gender]><[number]><nom>
  Comment : catches nominals with determiners
  Example : "Our Sundarbans" = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"


6. DET ADJ NOM (chunk: det_adj_nom)

7. PRNSUBJ (chunk: prnsubj)

8. PRNREF (chunk: prnref)

9. VBSER VBLEX (chunk: vbser_vblex)

10. VBHAVER VBLEX (chunk: vbhaver_vblex)

11. VAUX VBLEX (chunk: vaux_vblex)

12. FTAUX BE VBLEX (chunk: ftaux_be_vblex)

13. VBSER PRES (chunk: vbser_pri)

14. VBSER PAST (chunk: vbser_past)

15. VERB CONJ (chunk: verbcj)

16. VBSER PRES (chunk: vbser_pres)

17. VBHAVER VBSER VBLEX (chunk: vbhaver_vbser_vblex)

18. NUM NOM (chunk: num_nom)

19. POST (chunk: post)

20. FTAUX VBHAVER VBLEX (chunk: ftaux_vbhaver_vblex)

21. FTAUX VBHAVER VBSER VBLEX (chunk: ftaux_vbhaver_vbser_vblex)

22. VBDO (chunk: vbdo)

23. VBDO VBLEX (chunk: vbdo_vblex)

24. GERUND (chunk: vbger)

25. FTAUX BE (chunk: ftaux_be)

26. VBHAVER VBSER (chunk: vbhaver_vbser)

27. VBHAVER VBSER VBLEX ADV (chunk: vbhaver_vbser_vblex_adv)

28. FTAUX BE VBLEX ADV (chunk: ftaux_be_vblex_adv)

29. VBSER VBLEX ADV (chunk: vbser_vblex_adv)

30. VBHAVER VBLEX ADV (chunk: vbhaver_vblex_adv)

31. FTAUX VBLEX ADV (chunk: ftaux_vblex_adv)

32. VERB ADV (chunk: verbcj_adv)

33. VBDO VBLEX ADV (chunk: vbdo_vblex_adv)

34. GERUND ADV (chunk: gerund_adv)

35. ADV ADJ NOM (chunk: adv_adj_nom)

36. ADV ADJ (chunk: adv_adj)

37. DET ADV ADJ NOM (chunk: det_adv_adj_nom)

38. ART ADJ NOM (chunk: art_adj_nom)
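
The chunk notation in the examples for rules 1 to 5 follows one shape: a chunk name, chunk-level tags, and the lexical units wrapped in braces. As a minimal sketch of that shape (plain Python, not the chunker itself; the function names `make_chunk` and `parse_chunk` are hypothetical), the strings can be built and taken apart like this:

```python
import re

# ^name<tag>...{^lu$ ^lu$}$ as seen in the rule examples above.
CHUNK_RE = re.compile(r"^\^([^<{]+)((?:<[^>]+>)*)\{(.*)\}\$$")


def make_chunk(name, chunk_tags, lexical_units):
    """Format a chunk string from its name, tags, and lexical units."""
    tags = "".join(f"<{tag}>" for tag in chunk_tags)
    body = " ".join(f"^{unit}$" for unit in lexical_units)
    return f"^{name}{tags}{{{body}}}$"


def parse_chunk(chunk):
    """Split a chunk string into (name, [tags], [lexical units])."""
    match = CHUNK_RE.match(chunk)
    if match is None:
        raise ValueError(f"not a chunk: {chunk!r}")
    name, tags, body = match.groups()
    return (
        name,
        re.findall(r"<([^>]+)>", tags),
        re.findall(r"\^(.*?)\$", body),
    )


# Rebuild the example given for rule 1 (NOM).
nom = make_chunk("nom", ["SN", "mf", "sg", "nom"], ["বাংলাদেশ<np><top><2><3><4>"])
print(nom)

# Take apart the example given for rule 5 (DET NOM).
det_nom = "^det_nom<SN><m><sg><nom>{^আমাদের<det><gen>$ ^সুন্দরবন<np><top><2><3><4>$}$"
print(parse_chunk(det_nom))
```

Inside the braces, numbers such as `<2>` are kept as written in the examples; this sketch treats them as opaque text and does not resolve them against the chunk-level tags.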

Issues(en-bn)

  • In the rule "SN SV SN" in t2x, the case of the last SN is always forced to <obj>. This is not always right; for example,
  "I eat fish" → "আমি মাছকে খাই"

should be "আমি মাছ খাই", whereas

  "I love you" → "আমি আপনাকে ভালবাসি"

is correct with case <obj>.
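
To picture the distinction between the two sentences, the sketch below chooses the case of the last SN from the tags of the object instead of forcing <obj>. It is a hypothetical heuristic in Python, not the rule used in t2x and not a fix proposed on this page: the function name `choose_object_case` is invented, and the assumption that pronoun objects carry a `prn` tag and should be the only ones marked <obj> is a simplification made for illustration.

```python
def choose_object_case(object_tags):
    """Pick a case tag for the last SN under a hypothetical heuristic."""
    # Assumption: pronoun objects ("you") take <obj>,
    # other nominals ("fish") keep <nom>.
    return "obj" if "prn" in object_tags else "nom"


# "I love you": the object is a pronoun.
print(choose_object_case(["prn", "p2", "sg"]))

# "I eat fish": the object is a common noun.
print(choose_object_case(["n", "sg"]))
```

A real rule would likely need more than the part of speech, since the two examples above only show that a single forced case cannot cover both sentences.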