Difference between revisions of "User:Mlforcada/Robust LR for Transfer"


Revision as of 16:44, 1 January 2015

We need a way to implement Apertium4 Bison grammars that are robust.

Here, a Bison grammar is an LALR(1) grammar with actions written between braces.

There should be a procedure, ideally automatic, that completes the handwritten rules so as to generate a robust parser (and translator). The idea is to take the handwritten Bison grammar and complement it with automatically generated glue rules, in such a way that no conflicts are produced (or only harmless ones), yielding a new grammar.

One possible way to do so is to model left-to-right restarts as follows:

  • by accepting the longest possible constituent (in the original grammar) that cannot be merged with the remaining input, and generating some kind of translation for it
  • and then treating the remaining input as a complete sentence again
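
The restart strategy above can be sketched outside Bison as a greedy chunker. This is only an illustration, not Apertium code: the table `CONSTITUENTS`, the functions `longest_constituent` and `chunk`, and the `UNKNOWN` fallback are all hypothetical names, and the constituents of the small example grammar (S, NP, VP over the categories n and v) are assumed to be written out as flat category sequences.

```python
# Hypothetical table: each constituent of the example grammar expanded into
# flat sequences of lexical categories (possible here because NP : n only).
CONSTITUENTS = {
    "S": [["n", "v", "n"], ["n", "v"]],   # S : NP VP
    "VP": [["v", "n"], ["v"]],            # VP : v NP | v
    "NP": [["n"]],                        # NP : n
}


def longest_constituent(tags, start):
    """Return (name, length) of the longest constituent at `start`, or None."""
    best = None
    for name, patterns in CONSTITUENTS.items():
        for pattern in patterns:
            end = start + len(pattern)
            if tags[start:end] == pattern and (best is None or len(pattern) > best[1]):
                best = (name, len(pattern))
    return best


def chunk(tags):
    """Split `tags` left to right, restarting after each longest constituent."""
    chunks, pos = [], 0
    while pos < len(tags):
        found = longest_constituent(tags, pos)
        if found is None:
            # Assumption: material that starts no constituent is passed
            # through one token at a time.
            chunks.append(("UNKNOWN", tags[pos:pos + 1]))
            pos += 1
        else:
            name, length = found
            chunks.append((name, tags[pos:pos + length]))
            pos += length
    return chunks
```

Under these assumptions, `chunk(["v", "n", "n"])` splits the input into a VP covering `v n` followed by an NP covering the last `n`, while `chunk(["n", "v", "n"])` yields a single S. A real implementation would leave this decision to the LALR(1) parser itself, through glue rules.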

For instance, if the grammar is:

S : NP VP  { write(agree(nom($1),$2)) };
NP : n   { write($1) } ;
VP : v  { write($1) } 
   | v NP  { write(acc($2),$1) };

the sequence v n n could not be parsed and would lead to an error. However, it could be seen as a VP (v n) followed by an NP (n), and a translation could be generated for each. One could augment the grammar by systematically adding some rules:

S : NP VP  { write(agree(nom($1),$2)) }
  | NP TryS { write($1); write($2) }      /* translate an NP and skip */
  | VP TryS { write($1); write($2) }      /* translate a VP and skip */
  ;
TryS : S  { write($1); }
     | ;                                  /* the remaining part may be empty */
NP : n   { write($1) } ;
VP : v  { write($1) } 
   | v NP  { write(acc($2),$1) };
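
Since the added rules follow a fixed pattern (one glue alternative per constituent, plus the restart nonterminal), they could in principle be generated automatically. The following is a minimal sketch of that idea, assuming the list of constituents is given by hand; the function name `glue_rules` and its parameters are hypothetical, and the sketch does not check whether the resulting grammar has conflicts.

```python
def glue_rules(start, constituents, restart="TryS"):
    """Return Bison-style text for the glue alternatives and the restart rule.

    The text is meant to be appended to the handwritten alternatives of
    `start`, after removing their closing semicolon.
    """
    lines = []
    for name in constituents:
        # Translate one constituent, then try to parse the rest as a sentence.
        lines.append(f"  | {name} {restart} {{ write($1); write($2) }}")
    lines.append("  ;")
    # The restart nonterminal is either a full sentence or empty.
    lines.append(f"{restart} : {start} {{ write($1); }}")
    lines.append("  | ;")
    return "\n".join(lines)


if __name__ == "__main__":
    print(glue_rules("S", ["NP", "VP"]))
```

For the example grammar, calling `glue_rules("S", ["NP", "VP"])` reproduces the two glue alternatives and the `TryS` rule shown above, without the comments.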