N-Stage transfer
The idea of N-Stage transfer is to extend apertium-interchunk so that it outputs the same format that it takes as input, which would allow it to be called more than once. It would also be good to be able to "merge" chunks, for example NP CC NP → NP.
This is similar to the idea of cascaded finite-state chunking, as described by Abney (1996).
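As a rough illustration of why matching input and output formats matters, here is a minimal Python sketch of a single merge stage. The `Chunk` class and `merge_stage` function are hypothetical names invented for this example; they do not reflect apertium-interchunk's actual stream format or rule syntax. The stage is assumed to make one left-to-right pass and to replace each non-overlapping match of the pattern.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Chunk:
    tag: str                # chunk label, e.g. "NP" (hypothetical representation)
    words: Tuple[str, ...]  # surface words covered by the chunk


def merge_stage(chunks: List[Chunk], pattern: Tuple[str, ...], new_tag: str) -> List[Chunk]:
    """One left-to-right pass: replace each non-overlapping match of
    `pattern` with a single chunk labelled `new_tag`."""
    out: List[Chunk] = []
    n = len(pattern)
    i = 0
    while i < len(chunks):
        window = chunks[i:i + n]
        if tuple(c.tag for c in window) == pattern:
            # Keep the words of all merged chunks, in order.
            words = tuple(w for c in window for w in c.words)
            out.append(Chunk(new_tag, words))
            i += n
        else:
            out.append(chunks[i])
            i += 1
    return out


chunks = [
    Chunk("NP", ("the", "cat")),
    Chunk("CC", ("and",)),
    Chunk("NP", ("the", "dog")),
    Chunk("CC", ("and",)),
    Chunk("NP", ("the", "bird")),
]

# The output is again a list of Chunk objects, so the stage can be re-run.
once = merge_stage(chunks, ("NP", "CC", "NP"), "NP")
twice = merge_stage(once, ("NP", "CC", "NP"), "NP")
print([c.tag for c in once])   # ['NP', 'CC', 'NP']
print([c.tag for c in twice])  # ['NP']
```

Because the stage returns the same kind of list that it consumes, a second call can pick up the coordination that the first pass left behind.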
Example
- The girl with the telescope shouted at the boy who saw the dog in the field.
The current chunk-based transfer would normally chunk this into:
[The girl] [with] [the telescope] [shouted] [at] [the boy] [who] [saw] [the dog] [in] [the field]
NP PREP NP V PREP NP REL V NP PREP NP
This is quite a shallow analysis. With more stages of chunking, we could unify some of those chunks into more coherent phrases. For example, the next stage might unify PREP NP → PP, then NP PP → NP, then V NP → VP, and then NP REL VP → NP. We would end up with a more coherent and "deep" analysis, which might look something like:
The girl with the telescope shouted at the boy who saw the dog in the field
DET NOM PREP DET NOM V PREP DET NOM REL V DET NOM PREP DET NOM
NP PREP NP V PREP NP REL V NP PREP NP
(PREP NP → PP)    NP PP V PP REL V NP PP
(NP PP → NP)    NP V PP REL V NP
(V NP → VP)    NP V PP REL VP
(NP REL VP → NP)    NP V NP
This would not give us any more "transfer power", as the rules would still be finite-state, and non-recursive, but it would make certain tasks easier.
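The staged rewriting described in the example can be sketched as a small cascade over chunk tags. This is only an illustration under stated assumptions: `apply_rule`, `cascade` and `RULES` are hypothetical names, each stage is assumed to apply exactly one rule in a single left-to-right, non-overlapping pass, and the input is just the fragment "the boy who saw the dog in the field" from the example sentence.

```python
from typing import List, Sequence, Tuple

# A rule is (pattern of tags, replacement tag); hypothetical representation.
Rule = Tuple[Tuple[str, ...], str]


def apply_rule(tags: Sequence[str], rule: Rule) -> List[str]:
    """Apply one rule in a single left-to-right, non-overlapping pass."""
    pattern, replacement = rule
    out: List[str] = []
    i = 0
    while i < len(tags):
        if tuple(tags[i:i + len(pattern)]) == pattern:
            out.append(replacement)
            i += len(pattern)
        else:
            out.append(tags[i])
            i += 1
    return out


def cascade(tags: Sequence[str], rules: Sequence[Rule]) -> List[List[str]]:
    """Run the rules one stage at a time and keep every intermediate result."""
    stages = [list(tags)]
    for rule in rules:
        stages.append(apply_rule(stages[-1], rule))
    return stages


# The four stages named in the text, in the order given there.
RULES: List[Rule] = [
    (("PREP", "NP"), "PP"),
    (("NP", "PP"), "NP"),
    (("V", "NP"), "VP"),
    (("NP", "REL", "VP"), "NP"),
]

# Chunk tags for "[the boy] [who] [saw] [the dog] [in] [the field]".
fragment = ["NP", "REL", "V", "NP", "PREP", "NP"]
stages = cascade(fragment, RULES)
for stage in stages:
    print(" ".join(stage))
# NP REL V NP PREP NP
# NP REL V NP PP
# NP REL V NP
# NP REL VP
# NP
```

Each stage is an ordinary finite-state substitution; the nesting comes only from the fixed number of stages, which is why the cascade stays non-recursive.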
References
- Steven Abney. (1996) "Partial Parsing via Finite-State Cascades". J. of Natural Language Engineering, 2(4): 337-344.