Latest revision as of 14:53, 2 June 2016
Omorfi (Open Morphology of Finnish) is a computational morphology of Finnish written using HFST.
Requirements
You will need HFST installed; you can follow the instructions on the HFST page.
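Before building, it can help to confirm that the HFST command-line tools are actually on your PATH. The following is a minimal sketch, not part of Omorfi itself; the helper name check_tool is made up for illustration, and hfst-lookup is used only because it is the tool mentioned in the Usage section.

```shell
# check_tool is a hypothetical helper, not something Omorfi or HFST provides.
# It reports whether the named program can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# hfst-lookup is the HFST tool used later on this page.
check_tool hfst-lookup || echo "Install HFST before building Omorfi."
```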
Download
The following commands download Omorfi and prepare the build.
$ git clone https://github.com/flammie/omorfi
$ cd omorfi/
$ ./autogen.sh
$ ./configure
If autogen.sh does not work, please report a bug; in the meantime, autoreconf -i should work just as well.
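The fallback described above can be written as a small wrapper. This is only a sketch under the assumption that it is run from the top of the omorfi checkout; the function name bootstrap is invented here and does not come from the Omorfi sources.

```shell
# bootstrap is a hypothetical wrapper: try autogen.sh first and fall back
# to autoreconf -i if the script is missing or exits with an error.
bootstrap() {
    if [ -x ./autogen.sh ] && ./autogen.sh; then
        echo "bootstrapped with autogen.sh"
    else
        echo "autogen.sh failed or is missing; trying autoreconf -i" >&2
        autoreconf -i
    fi
}
```

From inside the checkout you would then run bootstrap followed by ./configure. If the fallback branch is taken, that is still worth reporting as a bug.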
Compilation
You need at least 1.5 GB of RAM to compile Omorfi, or be willing to let your machine sit around thrashing for some hours.
$ make
This will compile everything.
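If you are unsure whether the machine meets the memory requirement, a quick check before running make may save some waiting. The sketch below assumes a Linux system that exposes /proc/meminfo and reads "1.5 GB" as 1536 MiB; the helper name has_enough_ram is illustrative only.

```shell
# 1.5 GB expressed in kB, the unit used by /proc/meminfo (assumed reading).
required_kb=$((1536 * 1024))

# has_enough_ram is a hypothetical helper: it succeeds if the given amount
# of memory in kB meets the requirement.
has_enough_ram() {
    [ "$1" -ge "$required_kb" ]
}

# /proc/meminfo is Linux-specific; on other systems this check is skipped.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    if has_enough_ram "${total_kb:-0}"; then
        echo "Memory looks sufficient for compiling."
    else
        echo "Less than 1.5 GB of RAM: expect heavy swapping during make."
    fi
fi
```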
To prepare source code for a new Apertium language pair, use src/scripts/omor2apertium.sh... or just copy one from an existing pair, such as apertium-fin-eng.
Usage
After compiling, you can test it with the hfst-lookup program.
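As a rough illustration, hfst-lookup reads one token per line, so a sentence has to be split before it is passed in. The sketch below makes two assumptions: the helper name tokenise is invented here, and the ANALYSER variable is a placeholder that you must point at whichever analyser file your build produced, since this page does not say where it ends up.

```shell
# ANALYSER is a placeholder: set it to the analyser file produced by your
# build. No default is assumed here.
ANALYSER="${ANALYSER:-}"

# tokenise is a hypothetical helper that puts each space-separated token
# on its own line.
tokenise() {
    tr -s ' ' '\n'
}

sentence="kaikki ihmiset syntyvät vapaina ."

if [ -n "$ANALYSER" ] && command -v hfst-lookup >/dev/null 2>&1; then
    # Analyse each token with the compiled transducer.
    printf '%s\n' "$sentence" | tokenise | hfst-lookup "$ANALYSER"
else
    # Without an analyser, just show the tokens that would be looked up.
    printf '%s\n' "$sentence" | tokenise
fi
```

Note that punctuation must already be separated from words by spaces for this simple splitting to work.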
See also
External links
- Omorfi project site at Google Code
- Overview of the HFST project (pdf), esp. in relation to other FST technology