Hunmorph
Revision as of 17:36, 31 March 2008 by Francis Tyers (talk | contribs)
hunmorph is an set of programs for making morphological analysers and generators.
Requirements
You will need:
- ocaml
- ocaml-libs
Compiling
cd ocamorph ./build.sh build cd src/lib make cd ../bindings/c make cd ../../wrappers/ocamorph make
If you get the error, /usr/bin/ld: cannot find -lunix
, then check the Makefile and the include -I
paths, probably they don't point to the right place. On Debian I had to change the /usr/lib/ocaml/3.09.1
for /usr/lib/ocaml/3.10.1
. After you've compiled this you should have an ocamorph binary. Now go back to the root of your CVS tree.
lexicons/morphdb.hu/
Performance
For a 10,000 line test file,
$ time cat /tmp/test | ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic > /dev/null real 0m47.224s user 0m41.859s sys 0m0.620s
Compile the lexicon using:
$ ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic --bin hu.morph.bin
Then re-test:
$ time cat /tmp/test | ./ocamorph --bin hu.morph.bin > /dev/null real 0m15.023s user 0m14.625s sys 0m0.344s
Final size of the compiled binary is 22Mb.