Difference between revisions of "Hunmorph"

From Apertium

Revision as of 17:36, 31 March 2008

hunmorph is a set of programs for building morphological analysers and generators.

Requirements

You will need:

  • ocaml
  • ocaml-libs

Compiling

cd ocamorph
./build.sh build
cd src/lib
make
cd ../bindings/c
make
cd ../../wrappers/ocamorph
make

If you get the error /usr/bin/ld: cannot find -lunix, check the include (-I) paths in the Makefile; they probably do not point to the right place. On Debian I had to change /usr/lib/ocaml/3.09.1 to /usr/lib/ocaml/3.10.1. Once compilation finishes you should have an ocamorph binary. Now go back to the root of your CVS tree.
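The path change described above can be scripted. The following is a minimal sketch, not part of hunmorph: fix_ocaml_path is a hypothetical helper name, and the two directories are the ones from the Debian example, so substitute whatever your Makefile and system actually use.

```shell
# Hypothetical helper (not shipped with hunmorph): replace a hard-coded
# OCaml library directory in a Makefile with the one present on this system.
fix_ocaml_path() {
  makefile=$1   # Makefile to edit in place
  old=$2        # directory currently written in the Makefile
  new=$3        # directory that actually exists on this machine

  # Escape the dots so that "3.09.1" is matched literally by sed.
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')

  # Write to a temporary file first, then replace the original.
  sed "s|$old_re|$new|g" "$makefile" > "$makefile.tmp" &&
    mv "$makefile.tmp" "$makefile"
}

# Example call, using the paths from the Debian case above:
#   fix_ocaml_path Makefile /usr/lib/ocaml/3.09.1 /usr/lib/ocaml/3.10.1
```

If you are unsure which directory is correct, ocamlc -where prints the standard library directory of the installed compiler, and that directory is a likely candidate for the new path.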

lexicons/morphdb.hu/

Performance

For a 10,000-line test file, analysing directly from the .aff and .dic files:

$ time cat /tmp/test | ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic > /dev/null
real    0m47.224s
user    0m41.859s
sys     0m0.620s

Compile the lexicon using:

$ ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic --bin hu.morph.bin

Then re-test:

$ time cat /tmp/test | ./ocamorph --bin hu.morph.bin > /dev/null
real    0m15.023s
user    0m14.625s
sys     0m0.344s

The compiled lexicon (hu.morph.bin) is 22 MB.
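To put the two timings side by side, the ratio of the wall-clock ("real") times can be computed with a small shell function. This is only an illustrative sketch: speedup is a made-up helper name, and the two numbers are the real times reported above, converted to seconds.

```shell
# Hypothetical helper: ratio of two wall-clock times given in seconds.
speedup() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", before / after }'
}

# Real times from the two runs above: 47.224 s from the .aff/.dic files,
# 15.023 s from the compiled lexicon.
speedup 47.224 15.023   # prints 3.14
```

On this test file, loading the compiled lexicon is therefore roughly three times faster than reading the .aff and .dic files on every run.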

External links