Revision as of 12:39, 29 March 2009

Ideas

This idea is kind of like the TMX support in Apertium, only it goes at the end of the pipeline.

Issues: speed. Language models and phrase tables are slow, but most of the phrase table can be discarded without hurting translation quality [1].

Johnson, J.H., Martin, J., Foster, G., and Kuhn, R. (2007) "Improving Translation Quality by Discarding Most of the Phrasetable". Proceedings of EMNLP. 2007. NRC 49348.

↑ Johnson et al. (2007)
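
To make the "discard most of the phrase table" point concrete, here is a minimal sketch of pruning an in-memory phrase table. It uses a plain probability threshold plus a cap on the number of targets per source phrase. This is a simple stand-in chosen for illustration, not the pruning criterion of the cited paper. The function name `prune_phrase_table`, the dictionary layout, and the default values are all assumptions, not part of Apertium or Moses.

```python
def prune_phrase_table(table, min_prob=0.05, max_per_source=2):
    """Prune a toy phrase table (hypothetical layout: source -> {target: prob}).

    Keeps, for each source phrase, at most `max_per_source` targets whose
    probability is at least `min_prob`. Source phrases left with no targets
    are dropped entirely.
    """
    pruned = {}
    for source, targets in table.items():
        # Filter out low-probability targets, then keep the best few.
        kept = sorted(
            ((target, prob) for target, prob in targets.items() if prob >= min_prob),
            key=lambda item: item[1],
            reverse=True,
        )[:max_per_source]
        if kept:
            pruned[source] = dict(kept)
    return pruned


# Illustrative data only; the probabilities are made up.
toy_table = {
    "casa": {"house": 0.7, "home": 0.25, "household": 0.04, "casa": 0.01},
    "gato": {"cat": 0.02},
}
small_table = prune_phrase_table(toy_table)
```

A smaller table means fewer candidate phrases to look up and score, which is where the speed gain would come from.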

@@ Line 1: / Line 1: @@
+{{TOCD}}
 The idea of this project is to take advantage of all possible resources in creating MT systems for marginalised languages. The general idea is to use the output of various MT systems to produce one "better" translation. The "baseline" would be to use Apertium and Moses.
@@ Line 12: / Line 13: @@
 ** If it finds a matching phrase, it scores both on a language model and chooses the highest probability.
 ** This idea can be extended by incorporating user feedback. For example, when a user post-edits a phrase, the corrected phrase can be added to the phrase table with a given probability.
+** Could also help by resolving some unknown words.
-Issues: Speed &mdash; language models are slow.
+Issues: speed. Language models and phrase tables are slow, but most of the phrase table can be discarded without hurting translation quality.<ref>Johnson et al. (2007)</ref>
+==Notes==
+==References==
+* Johnson, J.H., Martin, J., Foster, G., and Kuhn, R. (2007) "Improving Translation Quality by Discarding Most of the Phrasetable". ''Proceedings of EMNLP''. 2007. NRC 49348.
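
The selection step described in the list above (score the candidates on a language model and keep the most probable one) could look roughly like the sketch below. It assumes a toy add-one smoothed bigram model trained on a handful of sentences. A real setup would use a proper language model toolkit. The names `train_bigram_lm`, `log_prob`, and `choose_translation` are hypothetical and do not come from Apertium or Moses.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams) + 1  # +1 reserves some mass for unseen words
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Counter returns 0 for unseen keys, so no KeyError here.
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def choose_translation(candidates, unigrams, bigrams):
    """Return the candidate with the highest language-model score."""
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))


# Illustrative data only: a tiny "corpus" and two candidate outputs,
# standing in for the output of two different MT systems.
lm = train_bigram_lm(["the cat sat on the mat", "the dog sat on the rug"])
best = choose_translation(
    ["the cat sat on the mat", "cat the on sat mat the"], *lm
)
```

Note that a raw log-probability favours shorter candidates, so comparing outputs of different lengths would probably need some length normalisation.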