Difference between revisions of "User:Mlforcada/Robust LR for Transfer"
Revision as of 16:42, 1 January 2015
We need a way to implement Apertium4 Bison grammars that are robust.
Here, a Bison grammar is an LALR(1) grammar with actions between braces.
There should be a procedure that completes the handwritten rules, automatically if possible, so as to generate a robust parser (and translator). The idea is to take the handwritten Bison grammar and complement it with automatically generated glue rules, in such a way that no conflicts are produced (or only harmless ones), to obtain a new grammar.
One possible way to do so is to model left-to-right restarts as follows:
- by accepting the longest possible constituent (in the original grammar G) that cannot be merged with the remaining output, and generating some kind of translation for it
- and treating the remaining output as a complete sentence again
For instance, if the grammar G is:
<pre>
S  : NP VP  { write(agree(nom($1),$2)) } ;
NP : n      { write($1) } ;
VP : v      { write($1) }
   | v NP   { write(acc($2),$1) } ;
</pre>
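As a quick check of which token sequences this toy grammar covers, here is a minimal sketch of a recognizer in Python. It is not part of the proposal: the function names (`match_np`, `match_vp`, `accepts`) are hypothetical, tokens are assumed to be the plain strings "n" and "v", and the actions between braces are ignored.

```python
def match_np(tokens, i):
    """Return the index just after an NP (a single n) starting at i, or None."""
    return i + 1 if i < len(tokens) and tokens[i] == "n" else None


def match_vp(tokens, i):
    """Return the index just after the longest VP (v or v NP) starting at i, or None."""
    if i >= len(tokens) or tokens[i] != "v":
        return None
    after_np = match_np(tokens, i + 1)
    return after_np if after_np is not None else i + 1


def accepts(tokens):
    """True if the whole token list is a sentence S : NP VP of the toy grammar."""
    after_np = match_np(tokens, 0)
    if after_np is None:
        return False
    # The VP must consume everything that follows the NP.
    return match_vp(tokens, after_np) == len(tokens)
```

Under these assumptions, `accepts(["n", "v"])` and `accepts(["n", "v", "n"])` are true, and every other sequence is rejected.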
With this grammar, the sequence v n n could not be parsed and would lead to an error. However, it could be seen as a VP (v n) followed by an NP (n), and a translation could be generated for both. One could augment the grammar by adding some rules:
<pre>
S    : NP VP    { write(agree(nom($1),$2)) }
     | NP TryS  { write($1); write($2) }   /* translate an NP and skip */
     | VP TryS  { write($1); write($2) }   /* translate a VP and skip */
     ;
TryS : S        { write($1); }
     | /* empty */ ;                       /* the remaining part may be empty */
NP   : n        { write($1) } ;
VP   : v        { write($1) }
     | v NP     { write(acc($2),$1) } ;
</pre>
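The augmented grammar is ambiguous (for example, n v can be read as S : NP VP or as an NP followed by a restarted VP), so its behaviour depends on how the parser resolves the choice. The following Python sketch simulates one possible reading, assuming the longest constituent is always preferred and that the original rule S : NP VP wins whenever it covers all the remaining input. The names `longest_np`, `longest_vp` and `chunk` are hypothetical, tokens are the plain strings "n" and "v", and the translation actions are replaced by simply labelling each chunk.

```python
def longest_np(tokens, i):
    """End index of an NP (a single n) starting at i, or None."""
    return i + 1 if i < len(tokens) and tokens[i] == "n" else None


def longest_vp(tokens, i):
    """End index of the longest VP (v or v NP) starting at i, or None."""
    if i >= len(tokens) or tokens[i] != "v":
        return None
    end = longest_np(tokens, i + 1)
    return end if end is not None else i + 1


def chunk(tokens):
    """Split tokens into (label, tokens) chunks, restarting after each chunk."""
    chunks, i = [], 0
    while i < len(tokens):
        np_end = longest_np(tokens, i)
        vp_end = longest_vp(tokens, np_end) if np_end is not None else None
        if vp_end == len(tokens):
            # Original rule S : NP VP covers everything that is left.
            label, end = "S", vp_end
        elif np_end is not None:
            # Glue rule S : NP TryS, translate the NP alone and restart.
            label, end = "NP", np_end
        else:
            # Glue rule S : VP TryS, translate the VP alone and restart.
            vp_only = longest_vp(tokens, i)
            if vp_only is None:
                raise ValueError(f"unknown token {tokens[i]!r} at position {i}")
            label, end = "VP", vp_only
        chunks.append((label, tokens[i:end]))
        i = end
    return chunks
```

With these assumptions, `chunk(["v", "n", "n"])` yields a VP chunk `["v", "n"]` followed by an NP chunk `["n"]`, while `chunk(["n", "v", "n"])` is still handled as a single S by the original rule.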