Ideas for Google Summer of Code/Robust recursive transfer
The purpose of this task is to create a prototype module to replace the apertium-transfer module(s). The new module will parse its input and allow transfer operations on the parsed result. Currently we have a problem with very distantly related languages that have long-distance constituent reordering, because we can only do finite-state chunking. The module should be designed to work cleanly with partial input, e.g. processing word by word rather than sentence by sentence.
Tasks
- Do a review of the literature on finite-state dependency parsing
Coding challenge
- Install Apertium (see Minimal installation from SVN)
- Parse one or more sentences from the story in your language by hand.
- Formalise some rules to show how the parsed representation could be converted to a representation suitable for generation in another language.
- Write a stream processor (see Apertium stream format) that takes as input the output of the lexical transfer module and processes character by character.
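As a starting point for the last item of the coding challenge, here is a minimal sketch of a character-by-character reader. It assumes the usual conventions of the Apertium stream format: `^` and `$` delimit a lexical unit, `/` separates the source side from the target side(s), `[` and `]` enclose superblanks, and `\` escapes the following character. The function name `stream_units`, the event tuples and the sample sentence are illustrative choices, not part of any existing Apertium API.

```python
from typing import Iterable, Iterator, List, Tuple, Union

Event = Tuple[str, Union[str, List[str]]]


def stream_units(chars: Iterable[str]) -> Iterator[Event]:
    """Consume characters one at a time and yield events as soon as they
    are complete: ("blank", text) or ("lu", [source, target, ...]).

    Escapes are kept in the text so a unit can be re-serialised unchanged.
    An unfinished lexical unit at end of input is not emitted.
    """
    state = "blank"          # one of: "blank", "superblank", "lu"
    buf: List[str] = []      # characters of the current blank or LU part
    parts: List[str] = []    # '/'-separated parts of the current LU
    escaped = False

    for ch in chars:
        if escaped:          # previous char was a backslash: take ch literally
            buf.append(ch)
            escaped = False
        elif ch == "\\":
            buf.append(ch)
            escaped = True
        elif state == "blank":
            if ch == "^":    # a lexical unit starts; flush pending blank text
                if buf:
                    yield ("blank", "".join(buf))
                buf, parts, state = [], [], "lu"
            else:
                buf.append(ch)
                if ch == "[":
                    state = "superblank"
        elif state == "superblank":
            buf.append(ch)   # '^', '/' and '$' are plain text in here
            if ch == "]":
                state = "blank"
        else:                # state == "lu"
            if ch == "/":
                parts.append("".join(buf))
                buf = []
            elif ch == "$":  # lexical unit complete: emit it immediately
                parts.append("".join(buf))
                yield ("lu", parts)
                buf, parts, state = [], [], "blank"
            else:
                buf.append(ch)

    if state != "lu" and buf:   # trailing blank text at end of input
        yield ("blank", "".join(buf))


if __name__ == "__main__":
    # Hypothetical sample of lexical transfer output (source/target pairs).
    sample = "[<b>]^casa<n><f><sg>/house<n><sg>$ ^blanca<adj><f><sg>/white<adj>$"
    for kind, value in stream_units(sample):
        print(kind, value)
```

Because the function is a generator over any iterable of characters, it can be fed from standard input with `iter(lambda: sys.stdin.read(1), "")`, so each lexical unit becomes available as soon as its closing `$` arrives rather than at the end of the sentence. A parser for the new module would consume these events instead of printing them.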
Frequently asked questions
Previous GSOC projects
- (2011) VM for transfer: relevant for understanding how the current transfer implementation works
Further reading
- Elworthy, D. (1999) "A Finite-State Parser with Dependency Structure Output"
- Oflazer, K. (1999) "Dependency Parsing with an Extended Finite State Approach"
- Alshawi, H., Douglas, S., Bangalore, S. (2000) "Learning Dependency Translation Models as Collections of Finite-State Head Transducers". Computational Linguistics 26(1)