Difference between revisions of "Ideas for Google Summer of Code/Robust recursive transfer"

From Apertium

Latest revision as of 19:16, 28 February 2019

The purpose of this task is to create a replacement for the apertium-transfer module(s): a module which will parse an input and allow transfer operations on it.

Currently we have a problem with very distantly related languages that have long-distance constituent reordering, because we can only do finite-state chunking. The module should be designed to work cleanly with partial input, e.g. processing word by word rather than sentence by sentence.
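As a rough illustration of word-by-word processing, the sketch below feeds one word at a time to a small shift-reduce parser and keeps whatever partial structure exists on a stack. Everything in it is an assumption made for illustration: the `IncrementalParser` class, the `RULES` table and the greedy reduction strategy are hypothetical and are not taken from the prototype at recursive transfer. A real module would more likely use a table-driven LR-style parser.

```python
# Hypothetical toy grammar: right-hand side (tuple of categories) -> left-hand side.
RULES = {
    ("det", "n"): "NP",
    ("prep", "NP"): "PP",
    ("NP", "PP"): "NP",
}


class IncrementalParser:
    """Greedy shift-reduce parser that accepts one word at a time."""

    def __init__(self, rules):
        self.rules = rules
        # Each stack entry is (category, children); a leaf's children is [word].
        self.stack = []

    def feed(self, category, word):
        """Shift one word, then reduce as far as the rules allow."""
        self.stack.append((category, [word]))
        self._reduce()

    def _reduce(self):
        reduced = True
        while reduced:
            reduced = False
            for rhs, lhs in self.rules.items():
                n = len(rhs)
                top = tuple(node[0] for node in self.stack[-n:])
                if len(self.stack) >= n and top == rhs:
                    children = self.stack[-n:]
                    del self.stack[-n:]
                    self.stack.append((lhs, children))
                    reduced = True
                    break


if __name__ == "__main__":
    parser = IncrementalParser(RULES)
    for category, word in [("det", "the"), ("n", "cat"), ("prep", "on")]:
        parser.feed(category, word)
        # The stack is usable after every word, even mid-sentence.
        print([node[0] for node in parser.stack])
```

Because the stack is valid after every call to `feed`, the input can stop at any point and the constituents recognised so far are still available for transfer.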

It should expect morphologically disambiguated input, and its own output should also be unambiguous (it should create a single parse tree).
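To make the input requirement concrete, here is a minimal sketch of reading disambiguated lexical units of the form `^lemma<tag1><tag2>$` and rejecting any unit that still carries several analyses. The names `LexicalUnit` and `read_units` are hypothetical, and the sketch is deliberately simplified: it treats any `/` inside a unit as an ambiguity marker and ignores escaped characters and superblanks.

```python
import re
from typing import List, NamedTuple, Tuple


class LexicalUnit(NamedTuple):
    lemma: str
    tags: Tuple[str, ...]


# One lexical unit: everything between '^' and the next '$' (no escape handling).
UNIT_RE = re.compile(r"\^([^$]*)\$")
TAG_RE = re.compile(r"<([^>]+)>")


def read_units(stream: str) -> List[LexicalUnit]:
    """Parse disambiguated lexical units; raise ValueError on ambiguous ones."""
    units = []
    for match in UNIT_RE.finditer(stream):
        body = match.group(1)
        if "/" in body:
            # Simplifying assumption: '/' separates alternative analyses.
            raise ValueError("ambiguous lexical unit: ^%s$" % body)
        lemma = body.split("<", 1)[0]
        tags = tuple(TAG_RE.findall(body))
        units.append(LexicalUnit(lemma, tags))
    return units


if __name__ == "__main__":
    for unit in read_units("^the<det><def><sp>$ ^cat<n><sg>$"):
        print(unit.lemma, unit.tags)
```

Failing early on ambiguous units keeps the parser's job simple: each position contributes exactly one category, which is a precondition for producing a single parse tree.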

Tasks

  1. Do a review of the literature on:
    1. finite-state dependency parsing
    2. LALR(1) grammars
  2. Propose a transfer rule formalism
  3. Write a number of transfer rules in this formalism for translating between a language pair.
  4. Reimplement an existing language pair in trunk using your new formalism. This will involve rewriting the existing rules to be compatible with your new formalism.
  5. Integrate your new rules into the existing pair.
  6. Evaluate the improvement
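One way to picture what a transfer rule formalism over parse trees could express is a table of child reorderings keyed by a node's category and the categories of its children. The sketch below is only an assumption about what such a formalism might look like; `REORDER`, `transfer` and `leaves` are hypothetical names, and the single rule (verb-object to object-verb) is an invented example. Because the rule applies to whole constituents, the object noun phrase can be arbitrarily long and the verb still moves past all of it, which a fixed-length finite-state pattern cannot do.

```python
# A leaf is (category, word); an internal node is (category, [children]).

# Hypothetical rule table: (parent, child categories) -> new order of children.
REORDER = {
    ("VP", ("v", "NP")): (1, 0),  # verb-object becomes object-verb
}


def transfer(node):
    """Recursively apply the reordering rules to a parse tree."""
    category, children = node
    if isinstance(children, str):
        return node  # leaf: nothing to reorder
    new_children = [transfer(child) for child in children]
    key = (category, tuple(child[0] for child in new_children))
    order = REORDER.get(key)
    if order is not None:
        new_children = [new_children[i] for i in order]
    return (category, new_children)


def leaves(node):
    """Return the words of a tree in left-to-right order."""
    category, children = node
    if isinstance(children, str):
        return [children]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


if __name__ == "__main__":
    tree = ("S", [
        ("NP", [("det", "the"), ("n", "cat")]),
        ("VP", [("v", "saw"),
                ("NP", [("det", "a"), ("n", "dog")])]),
    ])
    print(" ".join(leaves(transfer(tree))))
```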

Coding challenge

  1. Install Apertium (see Minimal installation from SVN)
  2. Compile the prototype code at recursive transfer.
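  3. Write a transfer grammar to perform word reordering for this story (http://www.unilang.org/ulrview.php?res=394,387; also available at https://svn.code.sf.net/p/apertium/svn/branches/xupaixkar/rasskaz/) for your chosen language pair.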
Optional
  1. Adjust the prototype code to include support for attributes.
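As a hedged sketch of what attribute support might involve, the example below lets a parent node inherit the attributes of its head child, so that a noun phrase carries the number of its noun. This is one possible reading of "attributes" and not a description of the prototype; the `Node` class, the `HEAD_CHILD` table and the `build` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical table: which child category is the head of each phrase.
HEAD_CHILD = {"NP": "n", "VP": "v"}


@dataclass
class Node:
    category: str
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)
    word: str = ""


def build(category, children):
    """Create a parent node that copies the attributes of its head child."""
    head_category = HEAD_CHILD.get(category)
    attrs = {}
    for child in children:
        if child.category == head_category:
            attrs = dict(child.attrs)  # copy so parent and child stay independent
            break
    return Node(category, children, attrs)


if __name__ == "__main__":
    det = Node("det", word="the")
    noun = Node("n", attrs={"num": "pl"}, word="cats")
    print(build("NP", [det, noun]).attrs)
```

With attributes available on phrase nodes, a transfer rule could test or modify them (for example, agreement between a subject and a verb) without having to look inside the constituent.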

Frequently asked questions

  • none yet, ask us something! :)

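See also

  • (2011) VM for transfer: Relevant to understand how the current transfer implementation works
  • Recursive transfer
  • User:Mlforcada/Robust LR for Transfer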

Further reading

  • Elworthy, D. (1999) "A Finite-State Parser with Dependency Structure Output"
  • Oflazer, K. (1999) "Dependency Parsing with an Extended Finite State Approach"
  • Alshawi, H., Douglas, S., Bangalore, S. (2000) "Learning Dependency Translation Models as Collections of Finite-State Head Transducers". Computational Linguistics 26(1)