Multi-engine translation synthesiser

From Apertium
Revision as of 12:33, 29 March 2009 by Francis Tyers

This project aims to take advantage of all available resources when creating MT systems for marginalised languages. The general idea is to combine the output of several MT systems to produce one "better" translation. The "baseline" would be to use Apertium and Moses.

Ideas

Statistical post-edition

This idea is similar to the TMX support in Apertium, except that it runs at the end of the pipeline.

  • A first approximation would be to:
    • Take a parallel corpus, e.g. Welsh--English, then run the Welsh side through Apertium to get an English(MT)--English corpus, from which a phrase table can be built.
    • Write a program that runs at the end of the pipeline and looks up each n-gram phrase of the MT output in the phrase table.
    • If it finds a matching phrase, it scores both the original and the replacement with a language model and keeps the one with the higher probability.
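
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions, not an existing Apertium or Moses component: the phrase table is a plain dictionary from MT n-grams to candidate replacements (a real phrase table would also carry translation probabilities), `BigramLM` is a toy add-one-smoothed bigram model standing in for a real language model, and the names `BigramLM`, `post_edit` and `max_n` are hypothetical. The search is greedy, left to right, trying the longest n-gram first.

```python
import math
from collections import Counter


class BigramLM:
    """Toy add-one-smoothed bigram model (assumed stand-in for a real LM)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def logprob(self, tokens):
        """Log-probability of a whole sentence, including boundary symbols."""
        padded = ["<s>"] + list(tokens) + ["</s>"]
        total = 0.0
        for prev, word in zip(padded, padded[1:]):
            num = self.bigrams[(prev, word)] + 1
            den = self.unigrams[prev] + self.vocab_size
            total += math.log(num / den)
        return total


def post_edit(tokens, phrase_table, lm, max_n=3):
    """Replace n-grams found in the phrase table when the LM prefers them."""
    out = list(tokens)
    i = 0
    while i < len(out):
        step = 1
        # Try the longest n-gram starting at position i first.
        for n in range(min(max_n, len(out) - i), 0, -1):
            source = tuple(out[i:i + n])
            if source not in phrase_table:
                continue
            # Score the sentence as it is, then with each candidate phrase.
            best, best_score = source, lm.logprob(out)
            for target in phrase_table[source]:
                score = lm.logprob(out[:i] + list(target) + out[i + n:])
                if score > best_score:
                    best, best_score = tuple(target), score
            out[i:i + n] = best
            step = max(1, len(best))  # move past the phrase just handled
            break
        i += step
    return out


# Hypothetical toy data: target-side text for the LM and a one-entry table.
lm = BigramLM([
    "the cat sat on the mat",
    "the dog sat on the mat",
    "a cat is on the mat",
])
phrase_table = {("sat", "in"): [("sat", "on")]}

mt_output = "the cat sat in the mat".split()
print(" ".join(post_edit(mt_output, phrase_table, lm)))  # the cat sat on the mat
```

Each candidate here is scored by re-scoring the whole sentence, which keeps the sketch short but makes one language model call per candidate; that cost is exactly the speed concern noted for this approach.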

Issues: Speed. Language models are slow to query, and this step would call one for every matching phrase.