Difference between revisions of "Moses"

(New page: ==Requisites== * GIZA++ and mkcls http://giza-pp.googlecode.com/files/giza-pp-v1.0.2.tar.gz * Moses (<code>svn co https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk</code...)
 
 
(20 intermediate revisions by 3 users not shown)
Line 1: Line 1:
- ==Requisites==
+ [[L'outil Moses|En français]]

- * GIZA++ and mkcls http://giza-pp.googlecode.com/files/giza-pp-v1.0.2.tar.gz
+ {{TOCD}}

- * Moses (<code>svn co https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk</code>)
- * IRST LM (<code> svn co https://irstlm.svn.sourceforge.net/svnroot/irstlm</code>)
+ ==Prerequisites==
+ * [[GIZA++]], see the page for how to compile that. Moses also supports [[mgiza]] as an alternative to Giza.
+ * [[IRSTLM]], see the page for how to compile that, and how to make a language model.

  ==Compiling==
+ Do

  <pre>
- tar -xzvf giza-pp-v1.0.2.tar.gz
- cd giza-pp
- make
- cp mkcls-v2/mkcls /path/prefix/bin
- cp GIZA++-v2/GIZA++ /path/prefix/bin
- cp GIZA++-v2/plain2snt.out /path/prefix/bin
- cp GIZA++-v2/snt2cooc.out /path/prefix/bin
- cp GIZA++-v2/snt2plain.out /path/prefix/bin
- cp GIZA++-v2/trainGIZA++.sh /path/prefix/bin
- cd ..
-
- cd trunk
- ./regenerate-makefiles.sh
- ./configure --prefix=/path/prefix
- make
- make install
+ git clone https://github.com/moses-smt/mosesdecoder
+ cd mosesdecoder/
+ ./bjam
  </pre>
+ The bjam part takes a long while.

- Now edit the file <code>scripts/training/train-factored-phrase-model.perl</code> and change the following line:
+ ==Troubleshooting==
+ If your logs anywhere say anything about UnicodeEncodeError, you might have to do

  <pre>
- # the following line is set installation time by 'make release'. BEWARE!
- my $BINDIR="/path/prefix/bin";
+ export PYTHONIOENCODING=utf-8
  </pre>
+ before running train-model.perl (or fix merge_alignments.py yourself)

- <pre>
- cp scripts/training/train-factored-phrase-model.perl /path/prefix/bin/
- cd ..
-
- cd irstlm
- ./install
- cp bin/* /path/prefix/bin/
- mkdir -p /path/prefix/include
- cp include/* /path/prefix/include
- cp lib/x86_64-pc-linux-gnu/libirstlm.a /path/prefix/lib/
- cd ..
-
- cd /path/prefix/bin
- ln -s snt2cooc snt2cooc.out
- ln -s mgizapp GIZA++
- </pre>
-
- ==Building language model==
-
- <pre>
- export IRSTLM=/path/prefix
-
- </pre>
+ ==See also==
+
+ * [[Using GIZA++]]
+
+ ==External links==
+
+ * [http://www.statmt.org/wmt08/baseline.html WMT08 Baseline system]

  [[Category:Tools]]
+ [[Category:Documentation in English]]
Latest revision as of 08:57, 29 April 2015

En français

Prerequisites

  • GIZA++: see its page for build instructions. Moses also supports mgiza as an alternative to GIZA++.
  • IRSTLM: see its page for build instructions and for how to make a language model.

Compiling

Clone the Moses repository and build it with bjam:

git clone https://github.com/moses-smt/mosesdecoder
cd mosesdecoder/
./bjam 

The bjam build step takes a long time.
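Since the build is slow, a parallel build may help. The sketch below only prints a `./bjam` command with the `-j` (parallel jobs) option so it can be inspected before running. The helper name `moses_build_cmd` and the `JOBS` variable are hypothetical, introduced here for illustration, and the default job count assumes `nproc` is available (falling back to 1 otherwise).

```shell
# Hypothetical helper: print a parallel bjam command instead of running it.
# JOBS is an assumed, user-chosen variable; it is not something Moses reads.
moses_build_cmd() {
    # Use JOBS if set, otherwise the CPU count from nproc, otherwise 1.
    _jobs="${JOBS:-$(nproc 2>/dev/null || echo 1)}"
    printf './bjam -j%s\n' "$_jobs"
}

moses_build_cmd
```

Run the printed command from inside `mosesdecoder/` to start the build. Setting `JOBS=4` before the call prints `./bjam -j4`.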

Troubleshooting

If your logs mention a UnicodeEncodeError, you may need to run

export PYTHONIOENCODING=utf-8

before running train-model.perl (or fix merge_alignments.py yourself).
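As a minimal sketch of the workaround above, the variable can be exported for the current shell session and then checked by asking Python which encoding it will use for standard output. The check assumes a `python3` interpreter is on the `PATH`; the Python scripts invoked during training may use a different interpreter on your system.

```shell
# Make Python use UTF-8 for stdin/stdout/stderr in this shell session.
export PYTHONIOENCODING=utf-8

# Sanity check (assumes python3 is on PATH): print the stdout encoding
# that Python picks up from the environment.
python3 -c 'import sys; print(sys.stdout.encoding)'
```

The variable can also be set for a single invocation by prefixing the command with `PYTHONIOENCODING=utf-8`, which avoids changing the rest of the session.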

See also

  • Using GIZA++

External links

  • WMT08 Baseline system (http://www.statmt.org/wmt08/baseline.html)