Difference between revisions of "Recursive transfer"

From Apertium

Revision as of 17:13, 2 October 2013

Deliverables

Deliverable 1

  • A program that takes the output of lt-proc -b (biltrans) and applies a grammar, performing only reordering (plus "insertion"/"deletion" of lexical units), with no tag changes
    • The input would be ^sl/tl$ and the output would be ^tl$
    • The grammar can be specified in a simple text-based CFG formalism, which is converted into a bison grammar and compiled.
Input
^Hau<prn><dem><sg>/This<prn><dem><sg>$ 
^irabazle<n>/winner<n><ND>$ 
^bat<num><sg>/a<det><ind><sg>$ 
^en<post>/of<pr>$ 
^historia<n>/story<n><ND>$ 
^a<det><art><sg>/the<det><def><sg>$ 
^izan<vbsint><pri><NR_HU>/be<vbser><pri><NR_HU>$
^.<sent>/.<sent>$
Output
^This<prn><dem><sg>$ 
^be<vbser><pri><NR_HU>$
^the<det><def><sg>$ 
^story<n><ND>$ 
^of<pr>$ 
^a<det><ind><sg>$ 
^winner<n><ND>$ 
^.<sent>$
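
The ^sl/tl$ to ^tl$ mapping shown above can be sketched in a few lines of Python. This is only an illustration of the stream format, not the proposed program: the names LU and target_units are hypothetical, and the sketch assumes exactly one translation per lexical unit and no escaped "/", "^" or "$" characters inside a unit. It does no reordering.

```python
import re

# One lexical unit of lt-proc -b output: ^source/target$
# Assumption: a single translation per unit and no escaped special characters.
LU = re.compile(r"\^([^/$]*)/([^$]*)\$")

def target_units(biltrans: str) -> list[str]:
    """Return the target side of every ^sl/tl$ unit, rewritten as ^tl$."""
    return [f"^{tl}$" for _sl, tl in LU.findall(biltrans)]

stream = "^Hau<prn><dem><sg>/This<prn><dem><sg>$ ^.<sent>/.<sent>$"
print(" ".join(target_units(stream)))
# prints: ^This<prn><dem><sg>$ ^.<sent>$
```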
Grammar
S -> SN SV sent { $1 $2 $3 }
SV -> SN v { $2 $1 }
SN -> N3 art { $2 $1 } | N3 { $1 } 
N3 -> SNGen N2 { $2 "^of<pr>$" $1 } | N2 { $1 } 
N2 -> nom { $1 } | prn { $1 } 
SNGen -> SN genpost { $1 }
sent -> "sent" { $1 } 
v -> "vbser.*" { $1 } | "vblex.*" { $1 } 
art -> "det.art.*" { $1 } | "num.sg" { $1 } 
nom -> "n" { $1 } 
prn -> "prn.*" { $1 } 
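
As a rough check that the grammar above yields the listed output, here is a minimal Python sketch that applies it to the example input. It is not the bison-based pipeline described in the deliverable: because the grammar is left-recursive (SN -> N3 -> SNGen -> SN), a small memoised span parser is used instead of recursive descent. Several things are assumptions made for illustration only: genpost is not defined on this page, so it is taken to match the tag "post"; a terminal is accepted if its pattern matches either the source or the target tags of a unit (the example needs both, since "num.sg" only occurs on the source side and "vbser" only on the target side); a trailing ".*" is read as "further tags follow"; and the first parse found is used.

```python
from functools import lru_cache

# Example input as (source tags, target tags, target lexical unit).
UNITS = [
    ("prn.dem.sg", "prn.dem.sg", "^This<prn><dem><sg>$"),
    ("n", "n.ND", "^winner<n><ND>$"),
    ("num.sg", "det.ind.sg", "^a<det><ind><sg>$"),
    ("post", "pr", "^of<pr>$"),
    ("n", "n.ND", "^story<n><ND>$"),
    ("det.art.sg", "det.def.sg", "^the<det><def><sg>$"),
    ("vbsint.pri.NR_HU", "vbser.pri.NR_HU", "^be<vbser><pri><NR_HU>$"),
    ("sent", "sent", "^.<sent>$"),
]

# Terminal categories and their tag patterns.
# "genpost" is an assumption: the grammar uses it without defining it.
TERMINALS = {
    "sent": ["sent"], "v": ["vbser.*", "vblex.*"],
    "art": ["det.art.*", "num.sg"], "nom": ["n"],
    "prn": ["prn.*"], "genpost": ["post"],
}

# Non-terminal -> [(right-hand side, output template)].
# In a template, an int n stands for $n and a str is a unit to insert.
RULES = {
    "S": [(("SN", "SV", "sent"), (1, 2, 3))],
    "SV": [(("SN", "v"), (2, 1))],
    "SN": [(("N3", "art"), (2, 1)), (("N3",), (1,))],
    "N3": [(("SNGen", "N2"), (2, "^of<pr>$", 1)), (("N2",), (1,))],
    "N2": [(("nom",), (1,)), (("prn",), (1,))],
    "SNGen": [(("SN", "genpost"), (1,))],
}

def tag_match(pattern, tags):
    """Exact match, or prefix match when the pattern ends in '.*' (assumed reading)."""
    if pattern.endswith(".*"):
        return tags.startswith(pattern[:-1])
    return tags == pattern

def parse(units):
    """Return the reordered target units as a tuple, or None on a parse failure."""
    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Output for `symbol` covering units[i:j], or None if it does not fit.
        if symbol in TERMINALS:
            if j - i != 1:
                return None
            sl, tl, unit = units[i]
            hit = any(tag_match(p, side)
                      for p in TERMINALS[symbol] for side in (sl, tl))
            return (unit,) if hit else None
        for rhs, template in RULES[symbol]:
            for parts in split(rhs, i, j):
                out = []
                for item in template:
                    out.extend(parts[item - 1] if isinstance(item, int) else [item])
                return tuple(out)  # first parse wins
        return None

    def split(rhs, i, j):
        # Yield the outputs of each RHS symbol for every way rhs covers units[i:j].
        if len(rhs) == 1:
            only = derive(rhs[0], i, j)
            if only is not None:
                yield [only]
            return
        for k in range(i + 1, j):
            first = derive(rhs[0], i, k)
            if first is not None:
                for rest in split(rhs[1:], k, j):
                    yield [first] + rest

    return derive("S", 0, len(units))

print(" ".join(parse(UNITS)))
```

The SNGen rule drops the postposition by emitting only $1, and the N3 rule inserts the literal "^of<pr>$", which is how "insertion"/"deletion" can be expressed without changing any tags.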

Deliverable 2

  • An XML format for the rules, based on the current format, taking into account transfer operations

Questions

  • What should happen on a parse failure?
  • Ambiguous grammars: can they be disambiguated automatically?
    • Could shift/reduce decisions be learned using target-language information?
  • How to convert right-recursive grammars to left-recursive ones.
  • How to apply macros in rules that have more than one non-terminal.

Algorithms

References

  • Prószéky & Tihanyi (2002) "MetaMorpho: A Pattern-Based Machine Translation System"
  • White (1985) "Characteristics of the METAL machine translation system at Production Stage" (§6)
  • Slocum (1982) "The LRC Machine translation system: An application of State-of-the-Art ..." (p.18)

See also

External links