Difference between revisions of "User:Snippyhollow"
Latest revision as of 16:46, 26 May 2009
I'm a Computer Science student from France. This year, I did both an engineering degree (ENSIMAG) and a Master's in Computer Science, specializing in Artificial Intelligence and Web. I am currently doing my Master's research internship at the National Institute of Informatics in Tokyo, on Inductive Logic Programming applied to biological systems.
Basic pruning
Needed files:
- replace.py (in apertium-combine/scripts/)
- split_prune.sh (in apertium-combine/scripts/)
- pruning.cc (in apertium-combine/pruning/)
- your lowercased corpus (see here)
python replace.py lowercase.en lowercase.cy
Then use the resulting "lowercase.rep.en" and "lowercase.rep.cy" for the rest of the phrase-table building (see below, Workflow for building a phrase-table)
sh split_prune.sh phrase-table
after setting "exe" inside the script to the correct path to your compiled pruning.cc (compile it with g++ -O3 apertium-combine/pruning/pruning.cc)
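The preparation for the steps above can be sketched as a couple of shell functions. This is only a minimal sketch: the helper names `missing_files` and `build_pruning` are hypothetical (they are not part of apertium-combine), and the name and location of the compiled binary are whatever you choose to pass in.

```shell
#!/bin/sh
# Hypothetical helper: print every argument that is not an existing regular file.
# Prints nothing when all the files are present.
missing_files() {
    for f in "$@"; do
        [ -f "$f" ] || printf '%s\n' "$f"
    done
}

# Hypothetical helper: compile pruning.cc with the flag given in the text.
#   $1 = path to pruning.cc
#   $2 = path of the binary to produce (your choice; "exe" in
#        split_prune.sh must then be set to this same path)
build_pruning() {
    g++ -O3 "$1" -o "$2"
}

# Example use, from the directory that contains apertium-combine/:
#   missing_files apertium-combine/scripts/replace.py \
#                 apertium-combine/scripts/split_prune.sh \
#                 apertium-combine/pruning/pruning.cc
#   build_pruning apertium-combine/pruning/pruning.cc ./pruning
```

If `missing_files` prints anything, fix those paths before compiling or running split_prune.sh.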
Workflow for building a phrase-table
snippy:moses snippy$ python replace.py work/corpus/30k.lowercased.en work/corpus/30k.lowercased.cy
snippy:moses snippy$ build-lm.sh -i work/corpus/30k.lowercased.rep.en -n 3 -o work/lm/30k-en.ilm.gz
snippy:moses snippy$ compile-lm work/lm/30k-en.ilm.gz --text yes work/lm/30k-en.lm
snippy:moses snippy$ rm work/model/*
snippy:moses snippy$ nohup nice tools/moses-scripts/scripts-20090409-0149/training/train-factored-phrase-model.perl \
-scripts-root-dir tools/moses-scripts/scripts-20090409-0149/ -root-dir work -corpus work/corpus/30k.lowercased.rep -f cy \
-e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:/Users/snippy/moses/work/lm/30k-en.lm >& work/training.out &
...
...
snippy:moses snippy$ rm -rf work/tuning/mert/filtered/
snippy:moses snippy$ nohup nice tools/moses-scripts/scripts-20090409-0149/training/mert-moses.pl work/tuning/100.lowercased.cy \
work/tuning/100.lowercased.en tools/moses/moses-cmd/src/moses work/model/moses.ini --working-dir work/tuning/mert \
--rootdir /Users/snippy/moses/tools/moses-scripts/scripts-20090409-0149/ --decoder-flags "-v 0" >& work/tuning/mert.out &
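In the commands above, the files passed to build-lm.sh and to -corpus carry a ".rep" inserted before the language extension of the names given to replace.py. The small function below reproduces that naming so the later steps can be scripted. It is a sketch under assumptions: `rep_name` is a hypothetical helper, the naming rule is inferred from the commands on this page rather than from replace.py itself, and the input is assumed to end in a language extension such as ".en" or ".cy".

```shell
#!/bin/sh
# Hypothetical helper: insert ".rep" before the final extension of a path.
#   ${1%.*}  = the path without its last extension
#   ${1##*.} = the last extension alone (here, the language code)
rep_name() {
    printf '%s\n' "${1%.*}.rep.${1##*.}"
}

# Example use:
#   rep_name work/corpus/30k.lowercased.en
#   prints work/corpus/30k.lowercased.rep.en
```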