Difference between revisions of "Moses"
Line 10:
  ==Compiling==
+ See [[Using GIZA++]] for how to compile that. Moses also supports [[mgiza]] as an alternative to Giza.
- {{see-also|Using GIZA++}}
+ See [[IRSTLM]] for how to compile that.
- ;GIZA++
+ Do
  <pre>
+ git clone https://github.com/moses-smt/mosesdecoder
- tar -xzvf giza-pp-v1.0.2.tar.gz
+ cd mosesdecoder/
- cd giza-pp
+ ./bjam
- make
- cp mkcls-v2/mkcls /path/prefix/bin
- cp GIZA++-v2/GIZA++ /path/prefix/bin
- cp GIZA++-v2/plain2snt.out /path/prefix/bin
- cp GIZA++-v2/snt2cooc.out /path/prefix/bin
- cp GIZA++-v2/snt2plain.out /path/prefix/bin
- cp GIZA++-v2/trainGIZA++.sh /path/prefix/bin
- cd ..
- </pre>
- ;Moses
- <pre>
- cd trunk
- ./regenerate-makefiles.sh
- ./configure --prefix=/path/prefix
- make
- make install
- cd scripts/training/symal
- make
- cp symal giza2bal.pl /path/prefix/bin
- cd ../../../
- cd scripts/training/phrase-extract
- make
- cp extract score /path/prefix/bin
- cd ../../../
- </pre>
- Now edit the file <code>scripts/training/train-factored-phrase-model.perl</code> and change the following lines:
- <pre>
- my $SCRIPTS_ROOTDIR = "/home/fran/source/moses/trunk/scripts/";
- ...
- # the following line is set installation time by 'make release'. BEWARE!
- my $BINDIR="/path/prefix/bin";
- </pre>
- <pre>
- cp scripts/training/train-factored-phrase-model.perl /path/prefix/bin/
- cp scripts/training/symal/giza2bal.pl /path/prefix/bin/
- cd ..
- </pre>
- ;IRSTLM
- <pre>
- cd irstlm
- cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=/path/prefix
- make -j4
- make install
  </pre>
+ The bjam part takes a long while.
  ==Building language model==
Revision as of 08:51, 29 April 2015
Requisites
- GIZA++ and mkcls (git clone https://github.com/moses-smt/giza-pp)
- Moses (git clone git@github.com:moses-smt/mosesdecoder.git)
- IRST LM (svn checkout svn://svn.code.sf.net/p/irstlm/code/trunk irstlm)
Compiling
See Using GIZA++ for how to compile GIZA++. Moses also supports mgiza as an alternative to GIZA++.
See IRSTLM for how to compile IRSTLM.
To build Moses, run:
git clone https://github.com/moses-smt/mosesdecoder
cd mosesdecoder/
./bjam
The bjam step takes a long time.
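Since the bjam build is slow, it may help to avoid re-cloning an existing checkout and to run the build in parallel. The following is only a sketch: the function names clone_if_missing and build_moses are made up for illustration, and the default of 4 parallel jobs is an assumption to adjust to your machine. The -j option is bjam's standard switch for running several build jobs at once.

```shell
# Hypothetical helper: clone the repository only when the target
# directory does not exist yet, otherwise reuse the checkout.
clone_if_missing() {
    url=$1
    dir=$2
    if [ -d "$dir" ]; then
        echo "reusing existing checkout: $dir"
    else
        git clone "$url" "$dir"
    fi
}

# Hypothetical helper: fetch Moses if needed and build it with bjam.
# The job count (default 4) is an assumption, not a recommendation.
build_moses() {
    jobs=${1:-4}
    clone_if_missing https://github.com/moses-smt/mosesdecoder mosesdecoder || return 1
    # Run the build in a subshell so the caller's directory is unchanged.
    ( cd mosesdecoder && ./bjam -j"$jobs" )
}

# Usage (not run here): build_moses 8
```

Running the build inside a subshell keeps the current directory untouched if the build fails halfway.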
Building language model
export IRSTLM=/path/prefix
build-lm.sh -i cy.crp.txt -o cy.lm.gz -t /tmp
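The language model step can be wrapped so that it fails early when the corpus or the IRSTLM variable is missing. This is a minimal sketch, not part of IRSTLM: build_lm is a hypothetical function name, and it assumes build-lm.sh was installed into $IRSTLM/bin and that IRSTLM has been exported as shown above. The options -i, -o and -t are the ones used on this page.

```shell
# Hypothetical wrapper around build-lm.sh.
# Assumption: IRSTLM is exported and its scripts live in $IRSTLM/bin.
build_lm() {
    corpus=$1
    output=$2
    if [ ! -f "$corpus" ]; then
        echo "corpus not found: $corpus" >&2
        return 1
    fi
    if [ -z "${IRSTLM:-}" ]; then
        echo "IRSTLM is not set" >&2
        return 2
    fi
    # Same invocation as on this page, with /tmp as the temporary directory.
    "$IRSTLM/bin/build-lm.sh" -i "$corpus" -o "$output" -t /tmp
}

# Usage (not run here): build_lm cy.crp.txt cy.lm.gz
```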
Troubleshooting
Set PYTHONIOENCODING to utf-8 (export PYTHONIOENCODING=utf-8) before running train-model.perl, or fix merge_alignments.py yourself.
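If you would rather not export the variable for the whole shell session, it can be set for a single command only. The sketch below uses a hypothetical helper named with_utf8_io; it relies only on the standard env utility, and the arguments to train-model.perl are left out because they depend on your setup.

```shell
# Hypothetical helper: run any command with PYTHONIOENCODING=utf-8
# set for that command (and its child processes) only.
with_utf8_io() {
    env PYTHONIOENCODING=utf-8 "$@"
}

# Check that the variable reaches the child process:
#   with_utf8_io printenv PYTHONIOENCODING
# Usage with training (arguments omitted on purpose):
#   with_utf8_io train-model.perl ...
```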