Difference between revisions of "Compounds"
Revision as of 10:02, 10 June 2007
Some languages (within Indo-European, particularly the Germanic languages) readily form long compound words. Any individual compound tends to have a low frequency, so it is unlikely to be found in a dictionary.
- Afrikaans: foutboodskap, fout+boodskap ("error message"), (cf. groeteboodskap, "greeting message")
- Dutch: hulppagina, hulp+pagina ("help page"); woordbetekenis, woord+betekenis ("meaning of a word")
- German: Kontaktlinsenverträglichkeitstest, Kontakt+linsen+verträglichkeits+test ("contact-lens compatibility test")
One possible approach is to try to split an unknown compound word into its constituent parts, which are more likely to be in the dictionary themselves.
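As a rough illustration of what such a method might look like, here is a minimal sketch of a dictionary-based splitter. It is not an existing implementation: the toy lexicon, the set of linking elements (only the German-style "s"), the minimum part length and the "fewest parts wins" preference are all assumptions made for the example.

```python
from functools import lru_cache

# Hypothetical toy lexicon; a real system would consult its own dictionary.
LEXICON = {
    "kontakt", "linsen", "verträglichkeit", "test",
    "fout", "boodskap", "hulp", "pagina",
}
# Assumed linking elements allowed between parts ("" means none).
LINKERS = ("", "s")


def split_compound(word, lexicon=LEXICON, linkers=LINKERS, min_len=3):
    """Split `word` into known parts, or return None if no full split exists.

    Among all possible splits, the one with the fewest parts is returned.
    A linking element is kept attached to the part that precedes it.
    """
    word = word.lower()

    @lru_cache(maxsize=None)
    def solve(start):
        # Best split of word[start:], as a tuple of parts, or None.
        if start == len(word):
            return ()
        best = None
        for end in range(start + min_len, len(word) + 1):
            part = word[start:end]
            if part not in lexicon:
                continue
            for linker in linkers:
                if not word.startswith(linker, end):
                    continue
                nxt = end + len(linker)
                # A linking element must be followed by another part.
                if linker and nxt == len(word):
                    continue
                rest = solve(nxt)
                if rest is None:
                    continue
                candidate = (part + linker,) + rest
                if best is None or len(candidate) < len(best):
                    best = candidate
        return best

    result = solve(0)
    return None if result is None else list(result)


print(split_compound("Kontaktlinsenverträglichkeitstest"))
# ['kontakt', 'linsen', 'verträglichkeits', 'test']
print(split_compound("foutboodskap"))
# ['fout', 'boodskap']
```

A purely dictionary-based splitter like this will over-split or mis-split some words; the papers under "Further reading" use corpus evidence to choose between candidate splits.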
Further reading
- Koehn, P. and Knight, K. (2003) "Empirical Methods for Compound Splitting". 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003).
- Brown, R. (2002) "Corpus-Driven Splitting of Compound Words". TMI 2002.
- Larson, M., Willett, D., Köhler, J. and Rigoll, G. (2000) "Compound splitting and lexical unit recombination for improved performance of a speech recognition system for German parliamentary speeches". Conference on Spoken Language Processing, 2000.