Difference between revisions of "Hunmorph"
Revision as of 17:55, 31 March 2008
hunmorph is a set of programs for making morphological analysers and generators.
Requirements
You will need:
- ocaml
- ocaml-libs
Compiling
cd ocamorph
./build.sh build
cd src/lib
make
cd ../bindings/c
make
cd ../../wrappers/ocamorph
make
If you get the error /usr/bin/ld: cannot find -lunix, check the Makefile and its include (-I) paths; they probably don't point to the right place. On Debian I had to change /usr/lib/ocaml/3.09.1 to /usr/lib/ocaml/3.10.1. After compiling you should have an ocamorph binary. Now go back to the root of your CVS tree.
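To see which include paths are stale, something like the following may help. It is only a sketch: missing_include_paths is a hypothetical helper name, it assumes the paths are written literally after -I in the Makefile (paths built from make variables will be reported as missing), and it relies on grep -o, which GNU and BSD grep provide.

```shell
# Hypothetical helper: print every -I path in a Makefile that does not exist on disk.
# Assumes literal paths after -I; paths built from make variables show up as missing.
missing_include_paths() {
    makefile="$1"
    grep -o -- '-I[[:space:]]*[^[:space:]]*' "$makefile" |
        sed 's/^-I[[:space:]]*//' |
        sort -u |
        while read -r dir; do
            # Report only directories that are not present on this machine.
            [ -d "$dir" ] || echo "$dir"
        done
}
```

Run it as missing_include_paths Makefile in the directory that fails to link. The directory the installed compiler actually uses is printed by ocamlc -where, which gives the value to substitute for any stale path.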
You can test ocamorph with the binary distribution available at http://ftp.mokk.bme.hu/Tool/Hunmorph/Resources/Morphdb.hu/morphdb-hu-20060525.tgz (the CVS distribution does not seem to build at the moment). If you untar the file in ~/source/ you should see:
$ ls ~/source/morphdb.hu/
AUTHORS  CVS  doc  LICENCE  morphdb_hu.aff  morphdb_hu.dic  README
You can then test it with:
$ echo "programot" | ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic
> programot
program/NOUN<CAS<ACC>>
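If you want the analyses in a form that is easier to process, a small filter along these lines might do. It is a sketch based only on the single sample above: it assumes that lines starting with "> " echo the input word and that every other non-empty line has the shape lemma/TAG. split_analyses is a hypothetical name, not part of hunmorph.

```shell
# Hypothetical filter: turn ocamorph output into tab-separated word, lemma, tag.
# Assumption (from one sample only): "> word" echoes the input, other lines are "lemma/TAG".
split_analyses() {
    awk '
        /^> /   { word = substr($0, 3); next }  # remember the current input word
        NF == 0 { next }                        # skip blank lines
        {
            slash = index($0, "/")              # first slash separates lemma from tag
            if (slash == 0) next                # ignore lines that do not fit the assumed shape
            printf "%s\t%s\t%s\n", word, substr($0, 1, slash - 1), substr($0, slash + 1)
        }
    '
}
```

Pipe the output of the ocamorph command above into split_analyses to get one line per analysis.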
Performance
For a 10,000-line test file, using an analyser with support for 4,000,000 word forms:
$ time cat /tmp/test | ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic > /dev/null
real 0m47.224s
user 0m41.859s
sys 0m0.620s
Compile the lexicon using:
$ ./ocamorph --aff ~/source/morphdb.hu/morphdb_hu.aff --dic ~/source/morphdb.hu/morphdb_hu.dic --bin hu.morph.bin
Then re-test:
$ time cat /tmp/test | ./ocamorph --bin hu.morph.bin > /dev/null
real 0m15.023s
user 0m14.625s
sys 0m0.344s
The final size of the compiled binary is 22 MB.
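The two timings can be compared with a little arithmetic. The sketch below uses the wall-clock ("real") times reported above and the 10,000-line file size; report_speedup is a hypothetical helper name.

```shell
# Hypothetical helper: compare two wall-clock times (in seconds) for the same input.
# Arguments: number of input lines, time with .aff/.dic files, time with the compiled lexicon.
report_speedup() {
    awk -v lines="$1" -v text="$2" -v bin="$3" 'BEGIN {
        printf "speedup: %.2fx\n", text / bin
        printf "aff/dic lexicon: %.0f lines/s\n", lines / text
        printf "compiled lexicon: %.0f lines/s\n", lines / bin
    }'
}

report_speedup 10000 47.224 15.023
```

With these figures the compiled lexicon is roughly three times faster, going from about 212 to about 666 lines per second.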