Difference between revisions of "Hfst"
Revision as of 13:03, 18 November 2009
HFST is the Helsinki Finite-State Toolkit. It is formalism-compatible with both lexc and twolc, so it stands to those tools roughly as foma does to xfst.
Prerequisites
- automake, autoconf, libtool
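Before compiling, it can help to confirm that the prerequisites are on your PATH. The following is a minimal sketch; `check_tools` is a hypothetical helper name, not part of hfst. Note that the executable names may differ by platform (for example, libtool may be installed as `libtoolize` or `glibtool`), so a "missing" report is a hint rather than proof.

```shell
# Report any build tool that is missing from PATH.
# Returns 0 when every named tool is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_tools automake autoconf libtool
```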
Compiling
Subversion checkout
- Mac OS X note: you need Xcode installed on your Mac. It ships with your computer and can also be downloaded from Apple (registration required).
$ svn co https://hfst.svn.sourceforge.net/svnroot/hfst/trunk hfst
$ cd hfst/hfst/
$ autoreconf -i
$ ./configure --prefix=/home/fran/local/
$ make
$ sudo make install
Prepackaged tarball
Download the latest version from [1] and unpack it. Then follow the instructions in the README file, i.e.:
$ cd hfst-2.0/
$ ./configure
$ make
$ sudo make install
Using
$ svn co https://victorio.uit.no/langtech/trunk/st/fao
$ cd fao/src
$ make -f Makefile.hfst
$ echo "orð" | hfst-lookup ../bin/fao-morph.hfst
lookup> orð orð+N+Neu+Sg+Nom+Indef
orð orð+N+Neu+Sg+Acc+Indef
orð orð+N+Neu+Pl+Nom+Indef
orð orð+N+Neu+Pl+Acc+Indef
lookup>
$
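If you only want the analyses and not the prompt or the surface form, the output can be filtered. This is a rough sketch that assumes each result line ends with the analysis as its last whitespace-separated field, as in the transcript above, and that the results arrive on standard output; `lookup_analyses` is a hypothetical helper name.

```shell
# Print only the analysis column from hfst-lookup output read on stdin.
# Assumption: result lines have at least two whitespace-separated fields
# and the analysis is the last one; bare "lookup>" prompt lines have one
# field and are skipped.
lookup_analyses() {
    awk 'NF >= 2 { print $NF }'
}

# Usage:
#   echo "orð" | hfst-lookup ../bin/fao-morph.hfst | lookup_analyses
```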
To compile lexc code, first concatenate all the lexc files:
$ cat fao-lex.txt noun-fao-lex.txt noun-fao-morph.txt adj-fao-lex.txt \
      adj-fao-morph.txt verb-fao-lex.txt verb-fao-morph.txt adv-fao-lex.txt \
      abbr-fao-lex.txt acro-fao-lex.txt pron-fao-lex.txt punct-fao-lex.txt \
      numeral-fao-lex.txt pp-fao-lex.txt cc-fao-lex.txt cs-fao-lex.txt \
      interj-fao-lex.txt det-fao-lex.txt > ../tmp/lexc-all.txt
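With this many input files, a missing or misspelled name makes `cat` print an error but still write a partial output file. A small wrapper can check the inputs first. This is only a sketch; `concat_lexc` is a hypothetical helper name, and the files are concatenated in exactly the order given.

```shell
# Concatenate lexc source files, in the given order, into one output file.
# Usage: concat_lexc OUTPUT FILE...
# Returns 1 without writing anything if any input file is missing.
concat_lexc() {
    out=$1
    shift
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing lexc file: $f" >&2
            return 1
        fi
    done
    cat "$@" > "$out"
}

# Usage:
#   concat_lexc ../tmp/lexc-all.txt fao-lex.txt noun-fao-lex.txt noun-fao-morph.txt
```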
To compile this, just use the hfst-lexc program:
$ hfst-lexc < ../tmp/lexc-all.txt > ../bin/lexc-fao.bin
To compile the twol rules, just use the hfst-twolc program:
$ hfst-twolc twol-fao.txt > twol-fao.bin
Then, to compose the lexicon and rule file, use hfst-compose-intersect:
$ hfst-compose-intersect -l lexc-fao.bin twol-fao.bin -o fao-gen.hfst
This creates a generator. If you want an analyser, invert the generator with hfst-invert:
$ hfst-invert fao-gen.hfst -o fao-morph.hfst
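The four steps above can be chained in one shell function. The following is a sketch under stated assumptions: `run` and `build_fao` are hypothetical helper names, all intermediate files are written to the current directory (the commands above mix `../bin/` and the current directory), and the hfst commands use exactly the arguments shown on this page. Setting `DRY_RUN=1` prints the commands instead of running them, which is useful for checking paths before a long compile.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build a generator and an analyser from a concatenated lexc file
# and a twol rule file.
# Usage: build_fao LEXC_SOURCE TWOL_SOURCE
# Assumption: all outputs go to the current directory.
build_fao() {
    lexc_src=$1
    twol_src=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Redirections cannot pass through run(), so print them literally.
        echo "hfst-lexc < $lexc_src > lexc-fao.bin"
        echo "hfst-twolc $twol_src > twol-fao.bin"
    else
        hfst-lexc < "$lexc_src" > lexc-fao.bin || return 1
        hfst-twolc "$twol_src" > twol-fao.bin || return 1
    fi
    # Compose lexicon and rules into a generator, then invert it
    # to obtain the analyser.
    run hfst-compose-intersect -l lexc-fao.bin twol-fao.bin -o fao-gen.hfst || return 1
    run hfst-invert fao-gen.hfst -o fao-morph.hfst
}

# Usage:
#   DRY_RUN=1 build_fao ../tmp/lexc-all.txt twol-fao.txt
```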