IRSTLM
IRSTLM is a free and open source toolkit for exact statistical language modelling that uses memory-mapping. Its language models are compatible with those created with the closed-source SRILM Toolkit.
See the homepage at http://hlt.fbk.eu/en/irstlm
Installation
svn checkout svn://svn.code.sf.net/p/irstlm/code/trunk irstlm
cd irstlm
cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=/path/prefix
make -j4
make install
Make a language model
export IRSTLM=/path/prefix
$IRSTLM/bin/build-lm.sh -i incorpus.txt -o out.lm.gz -t tmp/
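The build step fails in unhelpful ways if IRSTLM is unset or the corpus path is wrong, so it can be worth checking both before calling the script. The following is a minimal sketch of such a check, not part of IRSTLM itself: the function name build_lm and its three positional arguments are hypothetical, and it only wraps the build-lm.sh call with the -i, -o and -t options shown above.

```shell
# Hypothetical helper (not shipped with IRSTLM): validate the environment,
# then run build-lm.sh with the same -i/-o/-t options used above.
# Usage: build_lm CORPUS OUTPUT TMPDIR
build_lm() {
    corpus=$1
    output=$2
    tmpdir=$3

    # IRSTLM must point at the install prefix chosen at install time.
    if [ -z "${IRSTLM:-}" ]; then
        echo "IRSTLM is not set" >&2
        return 1
    fi

    # The script is expected under $IRSTLM/bin after 'make install'.
    if [ ! -x "$IRSTLM/bin/build-lm.sh" ]; then
        echo "build-lm.sh not found under $IRSTLM/bin" >&2
        return 1
    fi

    # The input corpus must exist and be readable.
    if [ ! -r "$corpus" ]; then
        echo "cannot read corpus: $corpus" >&2
        return 1
    fi

    "$IRSTLM/bin/build-lm.sh" -i "$corpus" -o "$output" -t "$tmpdir"
}
```

With the function defined in the current shell, the command above becomes: build_lm incorpus.txt out.lm.gz tmp/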
See also
- Moses
- Using GIZA++
- RandLM - a randomised LM, based on Bloom Filters