Hunmorph

hunmorph is a set of programs for building morphological analysers and generators.

Requirements

You will need:

  • ocaml
  • ocaml-libs

Compiling

cd ocamorph
./build.sh build
cd src/lib
make
cd ../bindings/c
make
cd ../../wrappers/ocamorph
make

If you get the error /usr/bin/ld: cannot find -lunix, check the -I include paths in the Makefile; they probably do not point to the right place. On Debian I had to change /usr/lib/ocaml/3.09.1 to /usr/lib/ocaml/3.10.1. Once compilation finishes you should have an ocamorph binary. Now go back to the root of your CVS tree.

lexicons/morphdb.hu/
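
If the linker error described above appears, the following sketch may help locate the path to fix. It assumes ocamlc is installed and on the PATH, that the Makefiles hard-code a path under /usr/lib/ocaml as in the Debian case above, and that it is run from the ocamorph source directory. The variable name ocaml_lib is only illustrative.

```shell
# Ask the installed compiler where its standard library lives
# (ocamlc -where prints that directory).
if command -v ocamlc >/dev/null 2>&1; then
    ocaml_lib=$(ocamlc -where)
else
    ocaml_lib=""
    echo "ocamlc not found on PATH" >&2
fi
echo "OCaml library directory: ${ocaml_lib:-unknown}"

# List Makefile lines that hard-code an OCaml path, so they can be
# compared with the directory printed above. No match is not an error.
grep -rn --include=Makefile '/usr/lib/ocaml' . || true
```

Any -I path in the output that differs from the printed library directory is a candidate for the edit described above.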

Performance

Timing for a 10,000-line test file, using an analyser that supports 4,000,000 word forms:

$ time cat /tmp/test | ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic > /dev/null
real    0m47.224s
user    0m41.859s
sys     0m0.620s

Compile the lexicon using:

$ ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic --bin hu.morph.bin

Then re-test:

$ time cat /tmp/test |  ./ocamorph  --bin hu.morph.bin > /dev/null
real    0m15.023s
user    0m14.625s
sys     0m0.344s

The compiled lexicon file (hu.morph.bin) is 22 MB.
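
Comparing the two "real" times above, loading the precompiled lexicon is roughly three times faster than reading the .aff and .dic files on every run. As a quick check, the ratio can be reproduced with awk; the helper name speedup is made up for this sketch and is not part of hunmorph.

```shell
# speedup OLD NEW: print OLD/NEW rounded to two decimal places.
# (hypothetical helper, used only for this illustration)
speedup() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", old / new }'
}

# "real" times in seconds from the two runs above:
# first from the .aff/.dic sources, then from hu.morph.bin
speedup 47.224 15.023
```

These figures come from a single run of each command, so they are only a rough indication.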

External links