Omorfi
OMorFi (Open Morphology of Finnish) is a computational morphology of Finnish written using SFST (or rather the Helsinki HFST variant).
Requirements
You will need SFST installed; follow the instructions on the SFST page.
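Before starting, it can help to confirm that the command-line tools used on this page are on your PATH. The following is a minimal sketch: the helper name missing_tools is hypothetical, and the tool list in the usage line is simply taken from the commands that appear in the steps below.

```shell
# Hypothetical helper: print every command named in the arguments
# that cannot be found on PATH (prints nothing if all are present).
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (tool names taken from the commands on this page):
#   missing_tools svn wget aclocal automake autoconf make fst-proc
```

Any name it prints needs to be installed before you continue.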
Download
You need both the morphology files (OMorFi) and the word list (Kotus sanalista). The SVN version of kotus-sanalista is also available, but compiling the list requires Java and Saxon, so a pre-compiled version is used here.
$ svn co http://svn.gna.org/svn/omorfi/trunk omorfi
$ cd omorfi/src
$ wget http://xixona.dlsi.ua.es/~fran/wordlists/kotus-sanalista-1a.xml
$ wget http://xixona.dlsi.ua.es/~fran/wordlists/kotus-sanalista.sfstlex
Edit the omorfi/configure.ac file and comment out the line AC_CONFIG_AUX_DIR([config-aux]). Then edit the file omorfi/src/Makefile.am and comment out the line KOTUS_LEX = kotus-sanalista.sfstlex (otherwise make will overwrite the files you just downloaded).
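The two edits above can also be scripted. This is a sketch only: the function name omorfi_comment_out is hypothetical, and it assumes the two lines start in the first column and look as quoted above (the KOTUS_LEX pattern tolerates different spacing around the equals sign). Each file is backed up with a .bak suffix before it is changed.

```shell
# Hypothetical helper: comment out the two lines named above.
# $1 is the path to the omorfi checkout (assumed to contain
# configure.ac and src/Makefile.am).
omorfi_comment_out() {
    dir="$1"
    # "dnl " is the m4 comment marker, so autoconf ignores the line.
    sed -i.bak 's/^AC_CONFIG_AUX_DIR(\[config-aux\])/dnl &/' "$dir/configure.ac"
    # "#" starts a comment in Makefile.am.
    sed -i.bak 's/^KOTUS_LEX[[:space:]]*=/#&/' "$dir/src/Makefile.am"
}

# Usage, from the directory that contains the checkout:
#   omorfi_comment_out omorfi
```

Compare each file with its .bak copy afterwards to check that only the intended line changed.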
$ aclocal
$ automake -a
$ autoconf
$ ./configure --with-kotus-sanalista=kotus-sanalista-1a.xml
Compilation
$ make
This can take 10 to 20 minutes.
Usage
After compiling, you can test it with the fst-proc program that comes with the Apertium SFST distribution:
$ echo "kaikki ihmiset syntyvät vapaina ja tasavertaisina arvoltaan ja oikeuksiltaan." | fst-proc omorfi/src/omorfi.sfstc
^kaikki/kaikki<noun><7><a><sg><nom>$ ^ihmiset/ihminen<noun><38><pl><acc>/ihminen<noun><38><pl><nom>$ ^syntyvät/syntyä<verb><52><j><act><pcpva><pl><acc>/syntyä<verb><52><j><act><pcpva><pl><nom>/syntyä<verb><52><j><act><indv><pres><pl3>$ ^vapaina/vapaa<noun><17><pl><ess>$ ^ja/*ja$ ^tasavertaisina/*tasavertaisina$ ^arvoltaan/arvo<noun><1><sg><abl><pl3>/arvo<noun><1><sg><abl><sg3>$ ^ja/*ja$ ^oikeuksiltaan/oikeus<noun><40><pl><abl><pl3>/oikeus<noun><40><pl><abl><sg3>$.
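In the output above, words the analyser does not recognise appear with a leading asterisk, as in ^ja/*ja$. To see which words in a text are missing from the lexicon, you could filter the output along these lines. This is a sketch: the function name unknown_words is hypothetical, and it assumes your grep supports the -o option (GNU and BSD grep both do).

```shell
# Hypothetical helper: read fst-proc output on stdin and list the
# surface forms that were left unanalysed, one per line, without repeats.
unknown_words() {
    # Each unit has the shape ^surface/analysis1/analysis2$;
    # an unknown word appears as ^surface/*surface$.
    grep -o '\^[^/$]*/\*[^$]*\$' |
        sed 's/^\^\([^/]*\)\/.*/\1/' |
        sort -u
}

# Usage:
#   echo "kaikki ihmiset ..." | fst-proc omorfi/src/omorfi.sfstc | unknown_words
```

For the sample sentence this would list ja and tasavertaisina.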
External links
- OMorFi: Installation
- Gna!: Omorfi
- Overview of the HFST project (pdf), esp. in relation to other FST technology