Difference between revisions of "User:David Nemeskey/GSOC progress 2013"
Revision as of 08:33, 29 May 2013
Contents
Tasks
XML format
Compiler
Miscellaneous / Extra
Hungarian CG grammar
Write a simple CG grammar for Hungarian, somewhere around 50-150 rules.
- Read Pasi Tapanainen's The Constraint Grammar Parser CG-2.
- Read the contents of cg_material.zip.
- Study the CG grammar of an Apertium language.
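- Write a Hungarian grammar that covers the sentences in this sample file (https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-hun-eng/texts/rasskaz.hun.txt).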
- The tags will be based on those in KR-code[1]. See the next task.
Hunmorph converter
Write a converter from ocamorph's output to Apertium's format.
- Again, use the sentences in this sample file as reference.
- While a C-based converter would definitely be possible, I opted for a foma-based (xfst -- lexc?) implementation, so that this task also serves as practice.
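To make the target of the conversion concrete, here is a minimal sketch in Python rather than foma, so it is not the planned implementation. It assumes that an ocamorph analysis looks like `lemma/TAG<FEATURE><FEATURE<VALUE>>` in KR-code and that the Apertium side uses the stream format `^surface/lemma<tag>...$`. The `TAG_MAP` table and the function names are hypothetical; the real tag mapping would have to be worked out against the sample file.

```python
# Hypothetical KR-code -> Apertium tag table (an assumption for illustration;
# the real mapping has to be derived from the sample sentences).
TAG_MAP = {
    "NOUN": "n",
    "VERB": "vblex",
    "PLUR": "pl",
    "CAS<ACC>": "acc",
}


def split_kr_tags(tags: str) -> list[str]:
    """Split 'NOUN<PLUR><CAS<ACC>>' into ['NOUN', 'PLUR', 'CAS<ACC>'],
    keeping nested brackets together as a single tag."""
    parts, depth, current = [], 0, ""
    for ch in tags:
        if ch == "<":
            if depth == 0:
                # A top-level '<' closes the part-of-speech prefix, if any.
                if current:
                    parts.append(current)
                    current = ""
            else:
                current += ch
            depth += 1
        elif ch == ">":
            depth -= 1
            if depth == 0:
                parts.append(current)
                current = ""
            else:
                current += ch
        else:
            current += ch
    if current:
        parts.append(current)
    return parts


def kr_to_apertium(surface: str, analysis: str) -> str:
    """Convert one assumed ocamorph analysis into one Apertium lexical unit.
    Raises KeyError for any KR tag missing from TAG_MAP."""
    lemma, _, tags = analysis.partition("/")
    apertium_tags = "".join(f"<{TAG_MAP[t]}>" for t in split_kr_tags(tags))
    return f"^{surface}/{lemma}{apertium_tags}$"
```

For example, `kr_to_apertium("almákat", "alma/NOUN<PLUR><CAS<ACC>>")` gives `^almákat/alma<n><pl><acc>$` under the hypothetical table above. Failing loudly on unmapped tags is a deliberate choice here, so that gaps in the mapping show up while running the sample file.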
ATT -> lttoolbox compiler
Write an ATT FST format reader for lttoolbox.
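As a rough illustration of what such a reader has to parse, the sketch below reads the AT&T text format in Python. It is not lttoolbox code, and `read_att` is a hypothetical name. It assumes the usual tab-separated layout: a transition line is `source`, `target`, `input symbol`, `output symbol` and an optional weight, while a line with only a state number (and an optional weight) marks a final state. Symbols are kept verbatim, since the epsilon notation (for example `@0@`) depends on the tool that wrote the file.

```python
from collections import defaultdict


def read_att(text: str):
    """Parse AT&T-format transducer text.

    Returns (transitions, finals), where transitions maps a source state to
    a list of (target, input_symbol, output_symbol, weight) tuples and
    finals maps each final state to its weight.
    """
    transitions = defaultdict(list)
    finals = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) in (1, 2):
            # Final state, with an optional weight.
            weight = float(fields[1]) if len(fields) == 2 else 0.0
            finals[int(fields[0])] = weight
        elif len(fields) in (4, 5):
            # Transition, with an optional weight.
            src, dst, isym, osym = fields[:4]
            weight = float(fields[4]) if len(fields) == 5 else 0.0
            transitions[int(src)].append((int(dst), isym, osym, weight))
        else:
            raise ValueError(f"line {lineno}: unexpected field count {len(fields)}")
    return dict(transitions), finals
```

A real reader for lttoolbox would additionally have to map the symbols onto its own alphabet and build its internal transducer from these transitions; that part is left out here.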
References
- [1] András Kornai, Péter Rebrus, Péter Vajda, Péter Halácsy, András Rung, Viktor Trón. 2004. Általános célú morfológiai elemző kimeneti formalizmusa (The output formalism of a general-purpose morphological analyzer). In: Proceedings of the 2nd Hungarian Computational Linguistics Conference.