N-Stage transfer

From Apertium
Revision as of 09:27, 23 March 2009 by Francis Tyers

The idea of n-stage transfer is to extend apertium-interchunk so that it outputs the same format that it takes as input, which would allow it to be called more than once. It would also be useful to be able to "merge" chunks, for example NP CC NP → NP.
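
As a rough illustration of both points, the sketch below treats a stage as a function from a list of chunks to a list of chunks, so that its output can be fed straight back in. The `Chunk` type and the `merge_stage` function are hypothetical names made up for this example; they are not part of apertium-interchunk and do not reflect the Apertium stream format.

```python
from typing import List, NamedTuple, Sequence


class Chunk(NamedTuple):
    tag: str    # chunk category, e.g. "NP" (hypothetical representation)
    words: str  # surface words covered by the chunk


def merge_stage(chunks: List[Chunk], pattern: Sequence[str], new_tag: str) -> List[Chunk]:
    """Replace each left-to-right, non-overlapping match of `pattern` by one chunk."""
    out: List[Chunk] = []
    i = 0
    n = len(pattern)
    while i < len(chunks):
        window = chunks[i:i + n]
        if [c.tag for c in window] == list(pattern):
            # Merge the matched chunks into a single chunk with the new tag.
            out.append(Chunk(new_tag, " ".join(c.words for c in window)))
            i += n
        else:
            out.append(chunks[i])
            i += 1
    return out


chunks = [
    Chunk("NP", "the cat"),
    Chunk("CC", "and"),
    Chunk("NP", "the dog"),
    Chunk("V", "slept"),
]
merged = merge_stage(chunks, ("NP", "CC", "NP"), "NP")
print(merged)
```

Because the input and output have the same type, `merge_stage` can be called again on `merged` with another pattern, which is the property the proposal asks of apertium-interchunk.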

This is something like the idea of cascaded finite-state chunking, as described by Abney (1995).

Example

The girl with the telescope shouted at the boy who saw the dog in the field.

The current chunk-based transfer would normally chunk this into:

[The girl] [with] [the telescope] [shouted] [at] [the boy] [who] [saw] [the dog] [in] [the field]
 NP         PREP   NP              V        PREP  NP        REL   V     NP       PREP  NP 

This is quite a shallow analysis. With more stages of chunking, we could unify some of those chunks into more coherent phrases. For example, the next stages might unify PREP NP → PP, then NP PP → NP, then V NP → VP, and finally NP REL VP → NP. We would end up with a more coherent and "deep" analysis, which might look something like:

 The girl  with   the telescope    shouted  at    the boy   who   saw   the dog  in    the field
 DET NOM   PREP   DET NOM          V        PREP  DET NOM   REL   V     DET NOM  PREP  DET NOM       *
 NP        PREP   NP               V        PREP  NP        REL   V     NP       PREP  NP            (PREP NP → PP)
 NP        PP                      V        PP              REL   V     NP       PP                  (NP PP → NP)
 NP                                V        PP              REL   V     NP                           (V NP → VP)
 NP                                V        PP              REL   VP                                 (NP REL VP → NP)
 NP                                V        NP
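
The cascade can be tried out with a small sketch. It assumes that each stage applies exactly one rule in a single left-to-right pass over non-overlapping matches, and that stages run in the order the rules are listed in the text. The names `RULES`, `apply_rule` and `cascade` are invented for this example, and chunks are plain `(tag, words)` tuples rather than anything in the Apertium format.

```python
# Each rule is (pattern of chunk tags, tag of the merged chunk).
RULES = [
    (("PREP", "NP"), "PP"),
    (("NP", "PP"), "NP"),
    (("V", "NP"), "VP"),
    (("NP", "REL", "VP"), "NP"),
]


def apply_rule(chunks, pattern, new_tag):
    """One stage: a single left-to-right pass merging matches of `pattern`."""
    out, i, n = [], 0, len(pattern)
    while i < len(chunks):
        window = chunks[i:i + n]
        if tuple(tag for tag, _ in window) == pattern:
            out.append((new_tag, " ".join(words for _, words in window)))
            i += n
        else:
            out.append(chunks[i])
            i += 1
    return out


def cascade(chunks, rules):
    """Run the stages in order and keep every intermediate result."""
    history = [chunks]
    for pattern, new_tag in rules:
        chunks = apply_rule(chunks, pattern, new_tag)
        history.append(chunks)
    return history


sentence = [
    ("NP", "The girl"), ("PREP", "with"), ("NP", "the telescope"),
    ("V", "shouted"), ("PREP", "at"), ("NP", "the boy"), ("REL", "who"),
    ("V", "saw"), ("NP", "the dog"), ("PREP", "in"), ("NP", "the field"),
]

history = cascade(sentence, RULES)
for stage in history:
    print(" ".join(tag for tag, _ in stage))
```

Under these assumptions the last rule finds nothing to match, because "at the boy" has already been turned into a PP by the first stage, so the sketch stops at NP V PP REL VP. Reaching a deeper analysis would need a different rule set or a different ordering of stages.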

This would not give us any more "transfer power", since the rules would still be finite-state and non-recursive, but it would make certain tasks easier.
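
One way to see the non-recursive limit is to count how many passes of a single rule are needed to absorb a chain of PPs. The sketch below is only an illustration under the assumption that a stage is one left-to-right pass; `one_pass` and `passes_until_stable` are hypothetical helpers, not Apertium functions.

```python
def one_pass(tags, pattern, new_tag):
    """Apply one rule once, left to right, over non-overlapping matches."""
    out, i, n = [], 0, len(pattern)
    while i < len(tags):
        if tuple(tags[i:i + n]) == pattern:
            out.append(new_tag)
            i += n
        else:
            out.append(tags[i])
            i += 1
    return out


def passes_until_stable(tags, pattern, new_tag):
    """Repeat the same stage until nothing changes; return (passes, result)."""
    count = 0
    while True:
        nxt = one_pass(tags, pattern, new_tag)
        if nxt == tags:
            return count, tags
        tags, count = nxt, count + 1


# An NP followed by three PPs, e.g. "the book on the table in the room of the house"
# after a PREP NP -> PP stage has already run.
count, result = passes_until_stable(["NP", "PP", "PP", "PP"], ("NP", "PP"), "NP")
print(count, result)
```

Each pass absorbs only one PP, so a chain of k PPs needs k stages of NP PP → NP. A pipeline with a fixed number of stages therefore handles nesting only up to a fixed depth.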

References