Difference between revisions of "Omorfi"

From Apertium
Revision as of 08:58, 3 December 2009

OMorFi (Open Morphology of Finnish) is a computational morphology of Finnish written using SFST (or rather the Helsinki HFST variant).

Requirements

You will need SFST installed; follow the instructions on the SFST page.
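Before starting, it can help to confirm that the command-line tools used on this page are on your PATH. The following is a minimal sketch: `missing_tools` is a hypothetical helper name, and the tool list is taken only from the commands shown on this page, so it may not cover everything your setup needs.

```shell
# Print the name of every tool given as an argument that is not on PATH.
# missing_tools is a hypothetical helper, not part of SFST or OMorFi.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools that appear in the commands on this page
missing_tools svn wget aclocal automake autoconf make fst-proc
```

If nothing is printed, all of the listed tools were found.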

Download

You need both the morphology files (OMorFi) and the wordlist (Kotus sanalista). The SVN version of kotus-sanalista is also available, but compiling the list from it requires Java and Saxon, so a pre-compiled version is used here.

$ svn co http://svn.gna.org/svn/omorfi/trunk omorfi
$ cd omorfi/src
$ wget http://xixona.dlsi.ua.es/~fran/wordlists/kotus-sanalista-1a.xml
$ wget http://xixona.dlsi.ua.es/~fran/wordlists/kotus-sanalista.sfstlex

Edit the omorfi/configure.ac file and comment out the line AC_CONFIG_AUX_DIR([config-aux]). Then edit omorfi/src/Makefile.am and comment out the line KOTUS_LEX = kotus-sanalista.sfstlex; otherwise make will overwrite the files you just downloaded.
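If you prefer not to edit the two files by hand, the same change can be sketched in shell. This assumes that each line appears in the file exactly as quoted above, with no leading whitespace or different spacing; if it does not, nothing is changed. `comment_out` is a hypothetical helper, the paths assume you run it from the directory that contains the omorfi checkout, and `dnl` is used as the comment marker for configure.ac and `#` for Makefile.am.

```shell
# Comment out one exact line in a file, keeping a .bak backup.
# Usage: comment_out FILE PREFIX LINE
# comment_out is a hypothetical helper, not part of OMorFi.
comment_out() {
    file=$1 prefix=$2 line=$3
    cp "$file" "$file.bak"
    # awk compares whole lines as literal strings, so no regex escaping is needed
    awk -v p="$prefix" -v l="$line" \
        '$0 == l { print p $0; next } { print }' "$file.bak" > "$file"
}

# Assumed location: the directory that contains the omorfi checkout
if [ -f omorfi/configure.ac ] && [ -f omorfi/src/Makefile.am ]; then
    comment_out omorfi/configure.ac "dnl " 'AC_CONFIG_AUX_DIR([config-aux])'
    comment_out omorfi/src/Makefile.am "# " 'KOTUS_LEX = kotus-sanalista.sfstlex'
fi
```

The .bak copies make it easy to compare or restore the originals if the build behaves unexpectedly.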

$ aclocal
$ automake -a
$ autoconf
$ ./configure --with-kotus-sanalista=kotus-sanalista-1a.xml

Compilation

$ make

This could take 10 to 20 minutes.

Usage

After compiling, you can test the analyser with the fst-proc program that ships with the Apertium SFST distribution:

$ echo "kaikki ihmiset syntyvät vapaina ja tasavertaisina arvoltaan ja oikeuksiltaan." | fst-proc omorfi/src/omorfi.sfstc

^kaikki/kaikki<noun><7><a><sg><nom>$ ^ihmiset/ihminen<noun><38><pl><acc>/ihminen<noun><38><pl><nom>$ 
^syntyvät/syntyä<verb><52><j><act><pcpva><pl><acc>/syntyä<verb><52><j><act><pcpva><pl><nom>/syntyä<verb><52><j><act><indv><pres><pl3>$ 
^vapaina/vapaa<noun><17><pl><ess>$ ^ja/*ja$ ^tasavertaisina/*tasavertaisina$ ^arvoltaan/arvo<noun><1><sg><abl><pl3>/arvo<noun><1><sg><abl><sg3>$ ^ja/*ja$ 
^oikeuksiltaan/oikeus<noun><40><pl><abl><pl3>/oikeus<noun><40><pl><abl><sg3>$.
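In the output above, each token is wrapped in ^...$, the surface form comes first, and the alternative analyses are separated by /. Unknown words are marked with *. To read the ambiguity more easily, the output can be split into one analysis per line. This is a rough sketch, assuming that neither surface forms nor analyses contain a literal / or $; `split_analyses` is a hypothetical helper name, and `grep -o` is a GNU/BSD extension rather than POSIX.

```shell
# Read fst-proc output on stdin and print one line per analysis:
#   surface<TAB>analysis
# split_analyses is a hypothetical helper, not part of fst-proc.
split_analyses() {
    grep -o '\^[^$]*\$' |          # one ^...$ unit per line
        sed 's/^\^//; s/\$$//' |   # strip the ^ and $ delimiters
        awk -F/ '{ for (i = 2; i <= NF; i++) print $1 "\t" $i }'
}

# Example on a fragment of the output shown above
printf '%s\n' '^ja/*ja$ ^vapaina/vapaa<noun><17><pl><ess>$' | split_analyses
```

Piping the result through `grep -F "$(printf '\t')*"` would then list only the tokens the analyser did not recognise.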

See also

* hfst
* foma

External links