Difference between revisions of "Moses"
Revision as of 08:53, 29 April 2015
Requisites
- GIZA++ and mkcls (git clone https://github.com/moses-smt/giza-pp)
- Moses (git clone git@github.com:moses-smt/mosesdecoder.git)
- IRST LM (svn checkout svn://svn.code.sf.net/p/irstlm/code/trunk irstlm)
Compiling
See Using GIZA++ for how to compile GIZA++ and mkcls. Moses also supports mgiza as an alternative to GIZA++.
See IRSTLM for how to compile IRSTLM.
To build Moses itself, run
git clone https://github.com/moses-smt/mosesdecoder
cd mosesdecoder/
./bjam
The bjam part takes a long while.
Building language model
export IRSTLM=/path/prefix
build-lm.sh -i cy.crp.txt -o cy.lm.gz -t /tmp
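Before running build-lm.sh, it can help to confirm that the prefix you are about to export really looks like an IRSTLM install. The following is only a minimal sketch: check_irstlm is a hypothetical helper (not part of IRSTLM or Moses), and it assumes that build-lm.sh lives under bin/ inside the install prefix, which may not match your layout.

```shell
# Hypothetical helper, not shipped with IRSTLM or Moses.
# Returns 0 if the given prefix looks like a usable IRSTLM install.
check_irstlm() {
    prefix=$1
    if [ -z "$prefix" ]; then
        echo "IRSTLM prefix is empty" >&2
        return 1
    fi
    # Assumption: the install step placed build-lm.sh under <prefix>/bin.
    if [ ! -x "$prefix/bin/build-lm.sh" ]; then
        echo "no executable build-lm.sh under $prefix/bin" >&2
        return 1
    fi
    return 0
}
```

It could be used as check_irstlm /path/prefix && export IRSTLM=/path/prefix, so that a wrong prefix is reported up front rather than partway through building the language model.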
Troubleshooting
If a UnicodeEncodeError shows up anywhere in your logs, you may need to run
export PYTHONIOENCODING=utf-8
before running train-model.perl (or fix merge_alignments.py yourself).
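To check that the variable is actually picked up, you can ask the Python interpreter which encoding it will use for standard output. This is a small sketch, assuming a python3 or python binary is on your PATH; the variable names py and enc are only illustrative.

```shell
# Set the override in the current shell, as in the instructions above.
export PYTHONIOENCODING=utf-8

# Pick whichever interpreter is available (assumption: one of these exists).
py=$(command -v python3 || command -v python)

# Capture the encoding Python reports for stdout under this setting.
enc=$("$py" -c 'import sys; sys.stdout.write(sys.stdout.encoding)')
echo "Python stdout encoding: $enc"
```

If the reported encoding is not utf-8, the export has probably not reached the shell that launches train-model.perl.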