Difference between revisions of "Moses"

Latest revision as of 08:57, 29 April 2015

Prerequisites

GIZA++: see its page for how to compile it. Moses also supports mgiza as an alternative to GIZA++.
IRSTLM: see its page for how to compile it and how to build a language model.

Compiling

Clone the repository and build it with bjam:

git clone https://github.com/moses-smt/mosesdecoder
cd mosesdecoder/
./bjam

The bjam step takes a long time.
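
Since the build is slow, it may help to run it in parallel. The following is a minimal sketch, assuming a Unix-like shell; bjam accepts a -j option for the number of parallel jobs, while the core-count detection and the `jobs` variable name are illustrative choices, not part of Moses.

```shell
# Count the available CPU cores; fall back to 1 if neither tool is present.
jobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Build in parallel when run from inside mosesdecoder/;
# otherwise just print the command that would be used.
if [ -x ./bjam ]; then
    ./bjam -j"$jobs"
else
    echo "run from inside mosesdecoder/: ./bjam -j$jobs"
fi
```

On a machine with little memory, a smaller job count than the number of cores may be the safer choice.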

Troubleshooting

If UnicodeEncodeError appears anywhere in your logs, you may need to run

export PYTHONIOENCODING=utf-8

before running train-model.perl (or fix merge_alignments.py yourself).
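
To confirm that the setting has taken effect before starting a long training run, something like the following sketch may be useful. It assumes a `python3` executable is on the PATH purely for the check; which interpreter the Moses scripts actually invoke is not stated here, and `py_encoding` is an illustrative variable name.

```shell
# Make Python use UTF-8 for stdin/stdout/stderr in every process
# started from this shell, including scripts called by train-model.perl.
export PYTHONIOENCODING=utf-8

# Optional check: ask Python which encoding it will use for stdout.
if command -v python3 >/dev/null 2>&1; then
    py_encoding=$(python3 -c 'import sys; print(sys.stdout.encoding)')
    echo "stdout encoding: ${py_encoding}"
fi
```

The export only lasts for the current shell session, so it has to be repeated (or added to a shell startup file) in any new terminal used for training.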

External links

WMT08 Baseline system

@@ Line 1: / Line 1: @@
+[[L'outil Moses|En français]]
 {{TOCD}}
-==Requisites==
+==Prerequisites==
+* [[GIZA++]], see the page for how to compile that. Moses also supports [[mgiza]] as an alternative to Giza.
+* [[IRSTLM]], see the page for how to compile that, and how to make a language model.
-* GIZA++ and mkcls http://giza-pp.googlecode.com/files/giza-pp-v1.0.2.tar.gz
-* Moses (<code>svn co https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk</code>)
-* IRST LM (<code> svn co https://irstlm.svn.sourceforge.net/svnroot/irstlm</code>)
 ==Compiling==
+Do
-{{see-also|Using GIZA++}}
 <pre>
+git clone https://github.com/moses-smt/mosesdecoder
-tar -xzvf giza-pp-v1.0.2.tar.gz
+cd mosesdecoder/
-cd giza-pp
+./bjam
-make
-cp mkcls-v2/mkcls /path/prefix/bin
-cp GIZA++-v2/GIZA++ /path/prefix/bin
-cp GIZA++-v2/plain2snt.out /path/prefix/bin
-cp GIZA++-v2/snt2cooc.out /path/prefix/bin
-cp GIZA++-v2/snt2plain.out /path/prefix/bin
-cp GIZA++-v2/trainGIZA++.sh /path/prefix/bin
-cd ..
-cd trunk
-./regenerate-makefiles.sh
-./configure --prefix=/path/prefix
-make
-make install
-cd scripts/training/symal
-make
-cd ../../../
-cd scripts/training/phrase-extract
-make
-cd ../../../
 </pre>
+The bjam part takes a long while.
+==Troubleshooting==
-Now edit the file <code>scripts/training/train-factored-phrase-model.perl</code> and change the following lines:
+If your logs anywhere say anything about UnicodeEncodeError, you might have to do
 <pre>
+export PYTHONIOENCODING=utf-8
-my $SCRIPTS_ROOTDIR = "/home/fran/source/moses/trunk/scripts/";
-...
-# the following line is set installation time by 'make release'.  BEWARE!
-my $BINDIR="/path/prefix/bin";
 </pre>
+before running train-model.perl (or fix merge_alignments.py yourself)
+==See also==
-<pre>
-cp scripts/training/train-factored-phrase-model.perl /path/prefix/bin/
-cp scripts/training/symal/giza2bal.pl /path/prefix/bin/
+* [[Using GIZA++]]
-cd ..
+==External links==
-cd irstlm
-./install
-</pre>
+* [http://www.statmt.org/wmt08/baseline.html WMT08 Baseline system]
-Now edit the files in <code>scripts/build-sublm.pl</code> and <code>scripts/merge-sublm.pl</code> and check the location of gzip,
-<pre>
-my $gzip="/usr/bin/gzip";
-my $gunzip="/usr/bin/gunzip";
-</pre>
-On Debian systems, <code>gzip</code> and <code>gunzip</code> are found in <code>/bin</code>, these two scripts will fail silently if gzip is not found.
-<pre>
-cp bin/* /path/prefix/bin/
-cp bin/x86_64-pc-linux-gnu/* /path/prefix/bin/
-mkdir -p /path/prefix/include
-cp include/* /path/prefix/include
-cp lib/x86_64-pc-linux-gnu/libirstlm.a /path/prefix/lib/
-cd ..
-</pre>
-==Building language model==
-<pre>
-export IRSTLM=/path/prefix
-build-lm.sh -i cy.crp.txt -o cy.lm.gz -t /tmp
-</pre>
-==See also==
-* [[Using GIZA++]]
 [[Category:Tools]]
+[[Category:Documentation in English]]
