Difference between revisions of "Hunmorph"
Revision as of 17:49, 31 March 2008
hunmorph is a set of programs for making morphological analysers and generators.
Requirements
You will need:
- ocaml
- ocaml-libs
Compiling
cd ocamorph
./build.sh build
cd src/lib
make
cd ../bindings/c
make
cd ../../wrappers/ocamorph
make
If you get the error /usr/bin/ld: cannot find -lunix, check the include (-I) paths in the Makefile; they probably do not point to the right place. On Debian I had to change /usr/lib/ocaml/3.09.1 to /usr/lib/ocaml/3.10.1. Once compilation finishes you should have an ocamorph binary. Now go back to the root of your CVS tree.
lexicons/morphdb.hu/
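If the cannot find -lunix error described above appears, it can help to ask the installed compiler where its libraries live before editing the Makefile. The following is a minimal sketch, assuming ocamlc is on your PATH; the variable name ocaml_lib_dir is only illustrative, and the exact location of the Unix library files inside that directory may differ between OCaml versions.

```shell
# Ask the installed OCaml compiler for its standard library directory.
# `ocamlc -where` is a standard ocamlc option; ocaml_lib_dir is a
# hypothetical variable name used only in this sketch.
if command -v ocamlc >/dev/null 2>&1; then
    ocaml_lib_dir="$(ocamlc -where)"
else
    ocaml_lib_dir=""
    echo "ocamlc not found on PATH" >&2
fi

echo "OCaml library directory: ${ocaml_lib_dir:-unknown}"

# Look for the Unix library files that the linker flag -lunix refers to.
# Assumption: they sit somewhere below the standard library directory.
if [ -n "$ocaml_lib_dir" ]; then
    find "$ocaml_lib_dir" -name 'libunix.a' -o -name 'unix.cmxa' 2>/dev/null
fi
```

The directory printed here is the one the -I paths in the Makefile should point to, for example /usr/lib/ocaml/3.10.1 in the Debian case mentioned above.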
Performance
For a 10,000-line test file, using an analyser that supports 4,000,000 word forms:
$ time cat /tmp/test | ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic > /dev/null
real 0m47.224s
user 0m41.859s
sys 0m0.620s
Compile the lexicon using:
$ ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic --bin hu.morph.bin
Then re-test:
$ time cat /tmp/test | ./ocamorph --bin hu.morph.bin > /dev/null
real 0m15.023s
user 0m14.625s
sys 0m0.344s
The final size of the compiled binary is 22 MB.
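To put the two timings above side by side, the speedup from using the compiled lexicon can be worked out directly from the reported figures. This is only a small arithmetic sketch; the variable names are illustrative and the numbers are copied from the two time runs.

```shell
# Timings in seconds, copied from the two `time` runs above.
plain_real=47.224   # --aff/--dic run, wall-clock time
bin_real=15.023     # --bin run, wall-clock time
plain_user=41.859   # --aff/--dic run, user CPU time
bin_user=14.625     # --bin run, user CPU time

# Divide the uncompiled timing by the compiled one to get the speedup.
real_speedup=$(awk -v a="$plain_real" -v b="$bin_real" 'BEGIN { printf "%.2f", a / b }')
user_speedup=$(awk -v a="$plain_user" -v b="$bin_user" 'BEGIN { printf "%.2f", a / b }')

echo "speedup (real): ${real_speedup}x"
echo "speedup (user): ${user_speedup}x"
```

On these figures the compiled lexicon is about 3.1 times faster in wall-clock time, which suggests that much of the first run is spent reading the .aff and .dic files rather than analysing the input.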