Difference between revisions of "Ideas for Google Summer of Code/Complex multiwords"
m (→Coding challenge: remove coding challenge; I'll replace it with one I'm interested in if/when I think of one)
Line 6:
 ==Coding challenge==
- * Write a stream processor (see [[Apertium stream format]]) for the output of apertium-tagger -p -g that parses character by character, respecting [[superblanks]].
 ==Frequently asked questions==
Revision as of 18:59, 16 March 2012
Write a bidirectional module for specifying complex multiword units, for example dirección general and zračna luka. In the Romance languages this is not a big problem, but in languages with case inflection (e.g. Serbo-Croatian, Slovenian, German, Icelandic) you cannot simply define a multiword as a fixed adj nom sequence, because the adjective has many inflected forms and each of them would have to be listed.
The module should be bidirectional, that is, it should be usable both for analysing and for generating these multiwords.
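As a rough illustration of what "bidirectional" could mean here, the following Python sketch describes an adj nom multiword once, by lemma, and uses that single entry both to merge two analysed words into one unit and to split the unit back into two agreeing words. Everything in it is an assumption made for illustration: the names LexicalUnit, MultiwordEntry, analyse and generate are hypothetical and not part of Apertium, the adjective lemma "zračni" is a guess at how a dictionary might list it, and the tags merely imitate Apertium-style tags such as adj, n, f, sg and gen.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Tags that adjective and noun must share (number and case). Illustrative set only.
AGREEMENT_TAGS = {"sg", "pl", "nom", "gen", "dat", "acc", "voc", "loc", "ins"}


@dataclass(frozen=True)
class LexicalUnit:
    """A lemma plus its morphological tags (hypothetical structure)."""
    lemma: str
    tags: Tuple[str, ...]


@dataclass(frozen=True)
class MultiwordEntry:
    """One adj + noun multiword, specified by lemma rather than by surface form."""
    adj_lemma: str
    noun_lemma: str
    gender: str
    mw_lemma: str


def agreement(tags: Tuple[str, ...]) -> Tuple[str, ...]:
    """Keep only the number and case tags, in their original order."""
    return tuple(t for t in tags if t in AGREEMENT_TAGS)


def analyse(units: List[LexicalUnit], entry: MultiwordEntry) -> List[LexicalUnit]:
    """Merge an agreeing adjective + noun pair into a single multiword unit."""
    out: List[LexicalUnit] = []
    i = 0
    while i < len(units):
        if i + 1 < len(units):
            adj, noun = units[i], units[i + 1]
            if (adj.lemma == entry.adj_lemma and "adj" in adj.tags
                    and noun.lemma == entry.noun_lemma and "n" in noun.tags
                    and entry.gender in adj.tags and entry.gender in noun.tags
                    and agreement(adj.tags) == agreement(noun.tags)):
                merged_tags = ("n", entry.gender) + agreement(noun.tags)
                out.append(LexicalUnit(entry.mw_lemma, merged_tags))
                i += 2
                continue
        out.append(units[i])
        i += 1
    return out


def generate(units: List[LexicalUnit], entry: MultiwordEntry) -> List[LexicalUnit]:
    """Split a multiword unit into adjective + noun, copying number and case to both."""
    out: List[LexicalUnit] = []
    for unit in units:
        if unit.lemma == entry.mw_lemma and "n" in unit.tags:
            agr = agreement(unit.tags)
            out.append(LexicalUnit(entry.adj_lemma, ("adj", entry.gender) + agr))
            out.append(LexicalUnit(entry.noun_lemma, ("n", entry.gender) + agr))
        else:
            out.append(unit)
    return out


# Usage: the genitive singular of the multiword, given as two analysed words.
entry = MultiwordEntry("zračni", "luka", "f", "zračna luka")
words = [
    LexicalUnit("zračni", ("adj", "f", "sg", "gen")),
    LexicalUnit("luka", ("n", "f", "sg", "gen")),
]
merged = analyse(words, entry)
restored = generate(merged, entry)
```

The point of the design is that the entry never lists inflected forms: the case and number are carried by the tags, so one entry covers the whole paradigm in both directions. A real module would also have to deal with the stream format, with words intervening between the parts, and with patterns other than adj nom, none of which this sketch attempts.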