Difference between revisions of "Hfst"

From Apertium

Revision as of 22:59, 26 October 2009

HFST is the Helsinki finite-state toolkit. It is formalism-compatible with both lexc and twolc, so it relates to those tools roughly as foma relates to xfst.

Prerequisites

  • automake, autoconf, libtool
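Before building, it can help to confirm that the prerequisites are on your PATH. The following is a minimal sketch, not part of HFST: `missing_tools` is a hypothetical helper name, and on some systems the libtool package installs its script under a different name (for example `libtoolize` or `glibtoolize`), so adjust the list to your platform.

```shell
# Print the name of every listed tool that is not found on PATH.
# Prints nothing if all tools are present.
# (missing_tools is a hypothetical helper, not part of HFST.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call for the prerequisites listed above:
#   missing_tools automake autoconf libtool
```

If the call prints nothing, all three tools were found.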

Compiling

Subversion checkout

Mac OS X note: you need Xcode installed on your Mac. It came with your computer, and can also be downloaded from Apple (registration required).

$ svn co https://hfst.svn.sourceforge.net/svnroot/hfst/trunk hfst 
$ cd hfst/hfst-2.0/
$ autoreconf -i
$ ./configure --prefix=/home/fran/local/
$ make
$ sudo make install

Prepackaged tarball

Download the latest version from [1], and unzip. Then follow the instructions in the README file, i.e.:

$ cd hfst-2.0/
$ ./configure
$ make
$ sudo make install

Using

$ svn co https://victorio.uit.no/langtech/trunk/st/fao
$ cd fao/src
$ make -f Makefile.hfst

$ echo "orð" | hfst-lookup ../bin/fao-morph.hfst
lookup> 
orð	orð+N+Neu+Sg+Nom+Indef
orð	orð+N+Neu+Sg+Acc+Indef
orð	orð+N+Neu+Pl+Nom+Indef
orð	orð+N+Neu+Pl+Acc+Indef

lookup>
$
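Each analysis line above has the surface form, a tab, and then the lemma followed by `+`-separated tags. As a rough sketch, assuming the output keeps exactly that two-column layout, the lines can be split into lemma and tags with awk. `split_analyses` is a hypothetical helper name, not an HFST command.

```shell
# Read hfst-lookup style output on stdin and print, for each analysis line,
# the surface form, the lemma, and the space-separated tags.
# Lines without a tab (the "lookup>" prompt, blank lines) are skipped.
# (split_analyses is a hypothetical helper, not part of HFST.)
split_analyses() {
    awk -F '\t' 'NF >= 2 {
        n = split($2, parts, "[+]")
        tags = ""
        for (i = 2; i <= n; i++) tags = tags (i > 2 ? " " : "") parts[i]
        printf "%s\tlemma=%s\ttags=%s\n", $1, parts[1], tags
    }'
}
```

It would be used at the end of the lookup pipeline, for example `echo "orð" | hfst-lookup ../bin/fao-morph.hfst | split_analyses`.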

To compile lexc code, first concatenate all the lexc files:

$ cat fao-lex.txt noun-fao-lex.txt noun-fao-morph.txt adj-fao-lex.txt \
adj-fao-morph.txt verb-fao-lex.txt verb-fao-morph.txt adv-fao-lex.txt \
abbr-fao-lex.txt acro-fao-lex.txt pron-fao-lex.txt punct-fao-lex.txt \
numeral-fao-lex.txt pp-fao-lex.txt cc-fao-lex.txt cs-fao-lex.txt \
interj-fao-lex.txt det-fao-lex.txt > ../tmp/lexc-all.txt
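With this many input files, a mistyped or missing name is easy to overlook, because `cat` reports the error but still writes the rest of the output. The sketch below checks that every file is readable before concatenating, and keeps the files in the order given. `concat_lexc` is a hypothetical helper name introduced here for illustration.

```shell
# Concatenate lexc source files into one output file, in the order given.
# Refuses to write anything if one of the input files is missing.
# Usage: concat_lexc OUTPUT FILE...
# (concat_lexc is a hypothetical helper, not part of HFST.)
concat_lexc() {
    out=$1
    shift
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing lexc file: $f" >&2
            return 1
        fi
    done
    cat "$@" > "$out"
}
```

For example, `concat_lexc ../tmp/lexc-all.txt fao-lex.txt noun-fao-lex.txt noun-fao-morph.txt` followed by the remaining files from the command above.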

Then start the hfst-lexc program, and run compile-source followed by save-source:

$ hfst-lexc

...

lexc> compile-source ../tmp/lexc-all.txt

...

Minimizing...Done!
lexc> save-source ../bin/lexc-fao.bin
opening "../bin/lexc-fao.bin"
Opening '../bin/lexc-fao.bin'...
Done.
lexc> quit

To compile the twol rules, use the hfst-twolc program:

$ hfst-twolc twol-fao.txt > twol-fao.bin

And then to compose the lexicon and rule file, use hfst-compose-intersect:

$ hfst-compose-intersect -l lexc-fao.bin twol-fao.bin -o fao-gen.hfst

This creates a generator. If you want an analyser, invert the generator with hfst-invert:

$ hfst-invert ../bin/fao-gen.hfst -o fao-morph.hfst
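The compose and invert steps can be wrapped in one small function. This is only a sketch: it uses the same two commands and options shown above, but the function names `run` and `build_analyser`, the `DRY_RUN` variable, and the choice to pass all four file names as arguments are assumptions made for illustration. With `DRY_RUN=1` the commands are printed and not executed, which is a quick way to check the file names first.

```shell
# Print a command, then execute it unless DRY_RUN=1.
# (run, build_analyser and DRY_RUN are hypothetical names, not part of HFST.)
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Compose the lexicon with the rules to get a generator, then invert it
# to get an analyser.
# Usage: build_analyser LEXICON RULES GENERATOR ANALYSER
build_analyser() {
    run hfst-compose-intersect -l "$1" "$2" -o "$3" &&
    run hfst-invert "$3" -o "$4"
}
```

For example, `DRY_RUN=1 build_analyser lexc-fao.bin twol-fao.bin fao-gen.hfst fao-morph.hfst` prints the two commands without running them.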

External links