Earley-based structural transfer for Apertium
Latest revision as of 21:21, 2 October 2013
Perhaps Earley's algorithm for parsing context-free grammars, which processes its input left to right in the same spirit as Apertium's left-to-right, longest-match approach, could be used to perform more complex syntactic transformations; this could be useful for distant language pairs whose sentences contain embedded structures.
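To make the idea concrete, here is a minimal sketch of an Earley recognizer in Python. It only decides whether a sequence of tags is covered by a grammar; it builds no tree and performs no transfer. The grammar, the tag names (`det`, `n`, `rel`, `vblex`) and the function name `earley_recognize` are illustrative assumptions, not part of Apertium, and rules with empty right-hand sides are deliberately left out to keep the sketch short.

```python
from collections import namedtuple

# An Earley item: a rule lhs -> rhs, a dot position inside rhs,
# and the input index at which recognition of the rule started.
Item = namedtuple("Item", "lhs rhs dot origin")


def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Empty right-hand sides are not supported in this sketch.
    """
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(pos, item):
        if item not in chart[pos]:
            chart[pos].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for pos in range(len(tokens) + 1):
        i = 0
        while i < len(chart[pos]):  # chart[pos] may grow while it is scanned
            item = chart[pos][i]
            i += 1
            if item.dot < len(item.rhs):
                symbol = item.rhs[item.dot]
                if symbol in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for rhs in grammar[symbol]:
                        add(pos, Item(symbol, rhs, 0, pos))
                elif pos < len(tokens) and tokens[pos] == symbol:
                    # Scanner: the next token matches the terminal after the dot.
                    add(pos + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in chart[item.origin]:
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(pos, parent._replace(dot=parent.dot + 1))

    return any(it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
               for it in chart[len(tokens)])


# Hypothetical toy grammar over tags, with an embedded relative clause.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("det", "n"), ("det", "n", "RelCl")],
    "RelCl": [("rel", "VP")],
    "VP": [("vblex",), ("vblex", "NP")],
}

# "the dog that chased the cat barks", reduced to tags
embedded = ["det", "n", "rel", "vblex", "det", "n", "vblex"]
print(earley_recognize(GRAMMAR, "S", embedded))
```

The relative clause nested inside the subject noun phrase is the kind of embedded structure that a fixed-length pattern cannot cover in general, because the clause can itself contain further noun phrases.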
Open questions
- Currently, Apertium modules communicate through text streams. I assume this would not be possible here.
- <s>When would one call the bilingual dictionary? Apertium Level 2 calls it in the first stage.</s>
- We should check whether this has been done before.
  - The English → Urdu translation system linked from the Urdu section of "Specific resources per language" seems to use LFG and Earley-based parsing.
- If a sentence has more than one parse, there should be a way to select the most likely one.
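One possible answer to the last question, sketched here only as an assumption, is to attach a probability to each grammar rule (a probabilistic context-free grammar) and to prefer the parse whose rules have the highest product of probabilities. The tree encoding, the rule probabilities, and the helper names `rules_of`, `log_score` and `most_likely` below are all hypothetical; nothing like this exists in Apertium as far as this page states.

```python
import math


def rules_of(tree):
    """Yield (lhs, rhs) for every internal node of a parse tree.

    A tree is a tuple (label, child, ...); a child is either another
    tuple (a subtree) or a string (a terminal tag).
    """
    label, *children = tree
    yield label, tuple(c[0] if isinstance(c, tuple) else c for c in children)
    for child in children:
        if isinstance(child, tuple):
            yield from rules_of(child)


def log_score(tree, rule_probs):
    """Sum of log-probabilities of the rules used in the tree."""
    return sum(math.log(rule_probs[rule]) for rule in rules_of(tree))


def most_likely(parses, rule_probs):
    """Return the candidate parse with the highest score."""
    return max(parses, key=lambda tree: log_score(tree, rule_probs))


# Hypothetical rule probabilities; in practice they would have to be
# estimated from parsed text.
RULE_PROBS = {
    ("VP", ("vblex", "NP")): 0.6,
    ("VP", ("vblex", "NP", "PP")): 0.4,
    ("NP", ("det", "n")): 0.7,
    ("NP", ("det", "n", "PP")): 0.3,
    ("PP", ("pr", "NP")): 1.0,
}

# Two parses of the tag sequence: vblex det n pr det n
attach_to_verb = ("VP", "vblex", ("NP", "det", "n"),
                  ("PP", "pr", ("NP", "det", "n")))
attach_to_noun = ("VP", "vblex",
                  ("NP", "det", "n", ("PP", "pr", ("NP", "det", "n"))))

best = most_likely([attach_to_noun, attach_to_verb], RULE_PROBS)
print(best is attach_to_verb)
```

With these made-up numbers the verb attachment scores 0.4 × 0.7 × 1.0 × 0.7 = 0.196 against 0.6 × 0.3 × 1.0 × 0.7 = 0.126 for the noun attachment, so the first is selected. Summing logarithms instead of multiplying probabilities avoids underflow on long sentences.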
Existing parsers
- Main article: Parsers
Current free-software parsers which might be worth looking at:
- AGFL parser (GPL): http://www.agfl.cs.ru.nl/
Further reading
- Koichi Takeda, "Pattern-Based Context-Free Grammars for Machine Translation" (private access), http://acl.ldc.upenn.edu/P/P96/P96-1020.pdf
- This paper proposes the use of "pattern-based" context-free grammars as a basis for building machine translation (MT) systems.
- Randall Sharp and Oliver Streiter, "Simplifying the Complexity of Machine Translation", http://www.iai.uni-sb.de/docs/meta93.pdf
- J. Earley (1970), "An efficient context-free parsing algorithm", Communications of the Association for Computing Machinery, 13(2):94-102, http://portal.acm.org/citation.cfm?doid=362007.362035