Difference between revisions of "Earley-based structural transfer for Apertium"

(3 intermediate revisions by 2 users not shown)
Line 4:
  * Currently, Apertium uses text streams to communicate. I assume this would not be possible here.
- * When would one call the bilingual dictionary? Apertium Level 2 calls it in the first stage.
+ * <s>When would one call the bilingual dictionary? Apertium Level 2 calls it in the first stage.</s>
  * We should check whether this has been done before.
- ::The English → Urdu translation system linked [[Incubator#Urdu|here]] seems to use LFG and Earley-based parsing.
+ ::The English → Urdu translation system linked [[Specific resources per language#Urdu|here]] seems to use LFG and Earley-based parsing.
  * In case there is more than one parse of a sentence, there should be a way to select the most likely.
Line 24:
  [[Category:Development]]
+ [[Category:Documentation in English]]
+ [[Category:Transfer]]

Latest revision as of 21:21, 2 October 2013

Perhaps Earley's algorithm for parsing context-free grammars (which, like Apertium, follows a left-to-right, longest-match philosophy) could be used to perform more complex syntactic transformations; this could be useful for distant language pairs whose sentences contain embedded structures.
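
To make the idea concrete, here is a minimal sketch of an Earley recognizer. It is not Apertium code: the toy grammar, the function name `earley_recognize`, and the choice of part-of-speech tags (`det`, `n`, `pr`, `vblex`) as terminals are all assumptions made for illustration. The sketch only answers whether a tag sequence is grammatical; it builds no tree and performs no transfer, and it does not support empty right-hand sides.

```python
from collections import namedtuple

# An Earley item: a dotted rule plus the input position where it started.
Item = namedtuple("Item", "lhs rhs dot start")

# Hypothetical toy grammar over part-of-speech tags. Keys are nonterminals;
# any symbol that is not a key is treated as a terminal.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("det", "n"), ("NP", "PP")],  # left recursion is fine for Earley
    "PP": [("pr", "NP")],
    "VP": [("vblex", "NP"), ("vblex",)],
}


def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start` (no empty rules)."""
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add(Item(start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        agenda = list(chart[i])

        def add(item):
            # Only queue items that are new at position i.
            if item not in chart[i]:
                chart[i].add(item)
                agenda.append(item)

        while agenda:
            item = agenda.pop()
            if item.dot < len(item.rhs):
                symbol = item.rhs[item.dot]
                if symbol in grammar:
                    # Predictor: expand the expected nonterminal here.
                    for rhs in grammar[symbol]:
                        add(Item(symbol, rhs, 0, i))
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scanner: the next token matches the expected terminal.
                    chart[i + 1].add(item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in list(chart[item.start]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(parent._replace(dot=parent.dot + 1))

    return any(it.lhs == start and it.dot == len(it.rhs) and it.start == 0
               for it in chart[len(tokens)])


embedded = "det n pr det n vblex det n".split()
print(earley_recognize(GRAMMAR, "S", embedded))
print(earley_recognize(GRAMMAR, "S", "det vblex".split()))
```

The first sequence contains a prepositional phrase embedded inside the subject noun phrase, which is the kind of nesting that fixed-length pattern matching struggles with. A transfer module would additionally need to keep back-pointers in the chart so that trees can be recovered and rewritten.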

Open questions

  • Currently, Apertium uses text streams to communicate. I assume this would not be possible here.
  • When would one call the bilingual dictionary? Apertium Level 2 calls it in the first stage.
  • We should check whether this has been done before.
      The English → Urdu translation system linked under "Specific resources per language" (Urdu section) seems to use LFG and Earley-based parsing.
  • In case there is more than one parse of a sentence, there should be a way to select the most likely one.
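
One possible answer to the last question is to attach a probability to each grammar rule and keep the parse whose rules have the highest product. The sketch below assumes parses are available as nested tuples; the rule probabilities, the tree encoding, and the names `log_score` and `most_likely` are all hypothetical and only serve to illustrate the idea.

```python
import math

# Hypothetical rule probabilities (made-up numbers, not trained on any corpus).
RULE_PROB = {
    ("VP", ("vblex", "NP")): 0.6,
    ("VP", ("vblex", "NP", "PP")): 0.4,
    ("NP", ("det", "n")): 0.7,
    ("NP", ("NP", "PP")): 0.3,
    ("PP", ("pr", "NP")): 1.0,
}


def label(node):
    """A leaf is a plain tag string; an inner node is (label, child, ...)."""
    return node if isinstance(node, str) else node[0]


def log_score(tree):
    """Sum of log-probabilities of all rules used in `tree`."""
    if isinstance(tree, str):
        return 0.0
    head, *children = tree
    rule = (head, tuple(label(child) for child in children))
    return math.log(RULE_PROB[rule]) + sum(log_score(c) for c in children)


def most_likely(parses):
    """Pick the parse with the highest total log-probability."""
    return max(parses, key=log_score)


# Two readings of the tag sequence "vblex det n pr det n":
# the PP attaches to the verb phrase (HIGH) or to the object NP (LOW).
HIGH = ("VP", "vblex", ("NP", "det", "n"), ("PP", "pr", ("NP", "det", "n")))
LOW = ("VP", "vblex", ("NP", ("NP", "det", "n"), ("PP", "pr", ("NP", "det", "n"))))

best = most_likely([LOW, HIGH])
print(best is HIGH)
```

With these made-up numbers the verb-attachment reading scores 0.4 × 0.7 × 1.0 × 0.7 = 0.196 against 0.6 × 0.3 × 0.7 × 1.0 × 0.7 = 0.0882 for the noun-attachment reading, so the former is selected. Log-probabilities are summed rather than multiplying raw probabilities to avoid underflow on long sentences.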

Existing parsers

Main article: Parsers

Current free-software parsers which might be worth looking at:

Further reading

This paper proposes the use of "pattern-based" context-free grammars as a basis for building machine translation (MT) systems.