Difference between revisions of "Moses"

Latest revision as of 08:57, 29 April 2015

GIZA++: see its page for how to compile it. Moses also supports mgiza as an alternative to GIZA++.
IRSTLM: see its page for how to compile it and how to build a language model.

Run:

git clone https://github.com/moses-smt/mosesdecoder
cd mosesdecoder/
./bjam

The bjam step takes a long time.

If a UnicodeEncodeError appears anywhere in your logs, you may need to run

export PYTHONIOENCODING=utf-8

before running train-model.perl (or fix merge_alignments.py yourself).
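
The check described above can be scripted. The following is only a minimal sketch: the helper name needs_utf8_io and the throwaway log file are hypothetical, and the real log location depends on how you invoke train-model.perl. It sets PYTHONIOENCODING only when the log actually mentions UnicodeEncodeError.

```shell
#!/bin/sh
# Sketch only: needs_utf8_io is a hypothetical helper, not part of Moses.

needs_utf8_io() {
    # $1: path to a log file; succeeds if the log mentions UnicodeEncodeError
    grep -q 'UnicodeEncodeError' "$1"
}

# Demo with a throwaway log standing in for a real training log (assumption).
log=$(mktemp)
printf '%s\n' "UnicodeEncodeError: 'ascii' codec can't encode character" > "$log"

if needs_utf8_io "$log"; then
    # Same setting as recommended above; export it before running train-model.perl.
    export PYTHONIOENCODING=utf-8
fi

echo "PYTHONIOENCODING=${PYTHONIOENCODING:-unset}"
rm -f "$log"
```

In practice you would point needs_utf8_io at your own training log and then rerun train-model.perl from the same shell, so that the exported variable is inherited by the Python scripts it calls.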

@@ Line 3: / Line 3: @@
 {{TOCD}}
-==Requisites==
+==Prerequisites==
+* [[GIZA++]], see the page for how to compile that. Moses also supports [[mgiza]] as an alternative to Giza.
+* [[IRSTLM]], see the page for how to compile that, and how to make a language model.
-* GIZA++ and mkcls (<code>git clone https://github.com/moses-smt/giza-pp</code>)
-* Moses (<code>git clone git@github.com:moses-smt/mosesdecoder.git</code>)
-* IRST LM (<code>svn checkout svn://svn.code.sf.net/p/irstlm/code/trunk irstlm</code>)
 ==Compiling==
-See [[Using GIZA++]] for how to compile that. Moses also supports [[mgiza]] as an alternative to Giza.
-See [[IRSTLM]] for how to compile that.
 Do
 <pre>
@@ Line 21: / Line 15: @@
 </pre>
 The bjam part takes a long while.
-==Building language model==
-<pre>
-export IRSTLM=/path/prefix
-build-lm.sh -i cy.crp.txt -o cy.lm.gz -t /tmp
-</pre>
 ==Troubleshooting==
+If your logs anywhere say anything about UnicodeEncodeError, you might have to do
 <pre>
-do
 export PYTHONIOENCODING=utf-8
+</pre>
 before running train-model.perl (or fix merge_alignments.py yourself)
-</pre>
 ==See also==