Omorfi

Revision as of 16:48, 31 March 2009 by Unhammer (talk | contribs) (→‎External links: wow HFST looks cool)

OMorFi (Open Morphology of Finnish) is a computational morphology of Finnish written using SFST (or rather the Helsinki HFST variant).

Requirements

You will need SFST installed; follow the instructions on the SFST page.

Download

You need both the morphology files (OMorFi) and the word list (Kotus sanalista). The SVN version of kotus-sanalista can be downloaded from here, but compiling the list requires Java and Saxon, so these instructions use a pre-compiled version instead.

$ svn co http://svn.gna.org/svn/omorfi/trunk omorfi
$ cd omorfi/src
$ wget http://xixona.dlsi.ua.es/~fran/wordlists/kotus-sanalista-1a.xml
$ wget http://xixona.dlsi.ua.es/~fran/wordlists/kotus-sanalista.sfstlex

Edit the file omorfi/configure.ac and comment out the line AC_CONFIG_AUX_DIR([config-aux]). Then edit the file omorfi/src/Makefile.am and comment out the line KOTUS_LEX = kotus-sanalista.sfstlex; otherwise make will overwrite the files you just downloaded.
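
If you prefer not to edit the two files by hand, the same change can be scripted. The following is only a sketch: comment_out is a hypothetical helper (it is not part of OMorFi or SFST), it assumes that each line appears in the file exactly as quoted above, including spacing, and it assumes you run it from the directory that contains the omorfi checkout.

```shell
# Hypothetical helper: put "#" in front of every line that begins with
# the given literal text. index() does a plain string match, so the
# brackets in the autoconf macro need no escaping.
comment_out() {
    # $1 = file to edit, $2 = literal text the line starts with
    awk -v prefix="$2" 'index($0, prefix) == 1 { print "#" $0; next } { print }' "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

# Only touch the files if they are where this page says they are.
if [ -f omorfi/configure.ac ]; then
    comment_out omorfi/configure.ac 'AC_CONFIG_AUX_DIR([config-aux])'
fi
if [ -f omorfi/src/Makefile.am ]; then
    comment_out omorfi/src/Makefile.am 'KOTUS_LEX = kotus-sanalista.sfstlex'
fi
```

Check both files afterwards; if the spacing in your checkout differs from the lines quoted here, the helper leaves the file unchanged and you will need to edit it manually.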

$ aclocal
$ automake -a
$ autoconf
$ ./configure --with-kotus-sanalista=kotus-sanalista-1a.xml

Compilation

$ make

This can take 10 to 20 minutes.

Usage

After compiling, you can test the analyser with the fst-proc program that comes with the Apertium SFST distribution:

$ echo "kaikki ihmiset syntyvät vapaina ja tasavertaisina arvoltaan ja oikeuksiltaan." | fst-proc omorfi/src/omorfi.sfstc

^kaikki/kaikki<noun><7><a><sg><nom>$ ^ihmiset/ihminen<noun><38><pl><acc>/ihminen<noun><38><pl><nom>$ 
^syntyvät/syntyä<verb><52><j><act><pcpva><pl><acc>/syntyä<verb><52><j><act><pcpva><pl><nom>/syntyä<verb><52><j><act><indv><pres><pl3>$ 
^vapaina/vapaa<noun><17><pl><ess>$ ^ja/*ja$ ^tasavertaisina/*tasavertaisina$ ^arvoltaan/arvo<noun><1><sg><abl><pl3>/arvo<noun><1><sg><abl><sg3>$ ^ja/*ja$ 
^oikeuksiltaan/oikeus<noun><40><pl><abl><pl3>/oikeus<noun><40><pl><abl><sg3>$.
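
In the output above, each word is wrapped in ^...$, with the surface form before the first slash. Following the usual Apertium stream format convention, an analysis that starts with * (as in ^ja/*ja$) means the word was not recognised. A quick way to see what the lexicon is missing is to list those words. The sketch below assumes a grep that supports -o (GNU and BSD grep do); unknown_words is a hypothetical helper name, and the sample input is copied from the output above so that the snippet runs without fst-proc.

```shell
# Hypothetical helper: read analyser output on stdin and print the
# surface form of every unit whose first analysis starts with "*".
unknown_words() {
    # grep -o puts each ^...$ unit on its own line;
    # sed keeps only the units of the form ^surface/*... and prints the surface form.
    grep -o '\^[^$]*\$' | sed -n 's|^\^\([^/]*\)/\*.*$|\1|p'
}

# Sample taken from the analysis shown above.
printf '%s\n' '^vapaina/vapaa<noun><17><pl><ess>$ ^ja/*ja$ ^tasavertaisina/*tasavertaisina$ ^arvoltaan/arvo<noun><1><sg><abl><pl3>/arvo<noun><1><sg><abl><sg3>$ ^ja/*ja$' |
    unknown_words
# prints, one per line: ja, tasavertaisina, ja
```

To use it on real text, pipe the output of fst-proc into unknown_words, and add sort -u at the end if you only want each unknown word once.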

External links