https://wiki.apertium.org/w/api.php?action=feedcontributions&user=78.170.98.79&feedformat=atomApertium - User contributions [en]2024-03-28T23:27:45ZUser contributionsMediaWiki 1.34.1https://wiki.apertium.org/w/index.php?title=Hfst&diff=27470Hfst2011-08-13T17:47:35Z<p>78.170.98.79: /* Prepackaged tarball */</p>
<hr />
<div>{{TOCD}}<br />
'''hfst''' is the Helsinki finite-state toolkit. It is formalism-compatible with both lexc and twolc, much as [[foma]] is with xfst. It is currently used in [[apertium-sme-nob]] and [[apertium-fin-sme]].<br />
<br />
The IRC channel is <code>#hfst</code> at <code>irc.freenode.net</code> (you may try [irc://irc.freenode.net/#hfst irc://irc.freenode.net/#hfst] if your browser supports it, or enter #hfst into http://webchat.freenode.net/ if you want a web client).<br />
<br />
==Prerequisites==<br />
<br />
* automake, autoconf, libtool<br />
<br />
HFST is a sort of meta-package with several ''backends''. To do anything useful, you'll need at least one (preferably all) of:<br />
* [[OpenFST]]<br />
* [[SFST]]<br />
* [[Foma]] -- used for lexc and xfst (sequential rewrite rules)<br />
<br />
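A quick way to see which of the build tools above are missing is sketched below. The function name <code>missing_tools</code> is made up for this example, and on some systems libtool installs its script as <code>libtoolize</code> or <code>glibtoolize</code>, so adjust the names as needed.<br />
<br />
```shell
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper, not part of HFST or the autotools.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: libtool is reachable as "libtoolize" on this system.
missing_tools automake autoconf libtoolize
```
<br />
No output means that all three tools were found.<br />
<br />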
==Compiling HFST3==<br />
<br />
===Subversion checkout===<br />
<br />
:"Mac OS X note: you need Xcode installed on your Mac. It is included with your computer and can also be downloaded from [http://developer.apple.com/ Apple] (registration required)"<br />
<br />
<pre><br />
$ svn co https://hfst.svn.sourceforge.net/svnroot/hfst/trunk hfst <br />
$ cd hfst/hfst3/<br />
$ sh autogen.sh<br />
$ ./configure --prefix=/home/USERNAME/local/ # remove --prefix if you just want it in /usr/local<br />
$ make<br />
$ sudo make install<br />
</pre><br />
<br />
===Prepackaged tarball===<br />
<br />
Download the latest version from [http://sourceforge.net/projects/hfst/files/ the SourceForge files page] and unpack it. Then follow the instructions in the README file, i.e.:<br />
<br />
<pre><br />
$ cd hfst-3.0/<br />
$ sh autogen.sh<br />
$ ./configure<br />
$ make<br />
$ sudo make install<br />
$ sudo ldconfig<br />
</pre><br />
<br />
===Troubleshooting===<br />
<br />
If the <code>./configure</code> step prints<br />
<pre>checking for GNU libc compatible malloc... no<br />
[…]<br />
checking for GNU libc compatible realloc... no</pre><br />
and <code>make</code> then fails with errors such as<br />
<pre>/usr/local/include/sfst/mem.h:37:57: error: 'malloc' was not declared in this scope</pre><br />
try the following:<br />
<br />
<pre>sudo ldconfig<br />
export LD_LIBRARY_PATH=/usr/local/lib<br />
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig</pre><br />
<br />
and then rerun <code>./configure</code> and <code>make</code>.<br />
<br />
For more advice on installation problems, see [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstReadme the Hfst Readme page].<br />
<br />
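The malloc and realloc symptom described above can be checked for automatically if the configure output is saved to a file. This is a minimal sketch; <code>configure_malloc_failed</code> and <code>configure.out</code> are names chosen for this example.<br />
<br />
```shell
# Succeed (exit status 0) if the saved configure output contains a failed
# "GNU libc compatible malloc/realloc" check.
# configure_malloc_failed is a hypothetical helper name.
configure_malloc_failed() {
    grep -Eq 'checking for GNU libc compatible (malloc|realloc)\.\.\. no' "$1"
}

# Intended use (not run here):
#   ./configure 2>&1 | tee configure.out
#   if configure_malloc_failed configure.out; then
#       echo "malloc check failed: run ldconfig and set the paths as above"
#   fi
```
<br />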
==Compiling HFST2==<br />
Regressions in HFST3 still make it unusable with [[apertium-sme-nob]] (last tested: revision 1204).<br />
<br />
Use revision 627 of HFST2:<br />
<br />
<pre><br />
$ svn co -r627 https://hfst.svn.sourceforge.net/svnroot/hfst/branches/hfst2<br />
$ cd hfst2<br />
$ autoreconf -i<br />
$ ./configure --prefix=/home/USERNAME/local/ # remove --prefix if you just want it in /usr/local<br />
$ make<br />
$ sudo make install<br />
</pre><br />
<br />
==Using==<br />
<br />
<pre><br />
$ svn co https://victorio.uit.no/langtech/trunk/st/fao<br />
$ cd fao/src<br />
$ make -f Makefile.hfst<br />
<br />
$ echo "orð" | hfst-lookup ../bin/fao-morph.hfst<br />
lookup> <br />
orð orð+N+Neu+Sg+Nom+Indef<br />
orð orð+N+Neu+Sg+Acc+Indef<br />
orð orð+N+Neu+Pl+Nom+Indef<br />
orð orð+N+Neu+Pl+Acc+Indef<br />
<br />
lookup><br />
$<br />
<br />
</pre><br />
<br />
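If only the analyses are wanted, the lookup output above can be filtered. This is a minimal sketch, assuming the input form and the analysis are separated by whitespace as shown; <code>analyses_only</code> is a hypothetical helper name.<br />
<br />
```shell
# Read hfst-lookup output on stdin and keep only the second column (the analysis).
# Prompt lines such as "lookup>" and blank lines have fewer than two fields
# and are therefore dropped.
analyses_only() {
    awk 'NF >= 2 { print $2 }'
}

# Intended use (requires the analyser built above, so it is not run here):
#   echo "orð" | hfst-lookup ../bin/fao-morph.hfst | analyses_only
```
<br />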
To compile <code>lexc</code> code, first concatenate all the lexc files:<br />
<br />
<pre><br />
$ cat fao-lex.txt noun-fao-lex.txt noun-fao-morph.txt adj-fao-lex.txt \<br />
adj-fao-morph.txt verb-fao-lex.txt verb-fao-morph.txt adv-fao-lex.txt \<br />
abbr-fao-lex.txt acro-fao-lex.txt pron-fao-lex.txt punct-fao-lex.txt \<br />
numeral-fao-lex.txt pp-fao-lex.txt cc-fao-lex.txt cs-fao-lex.txt \<br />
interj-fao-lex.txt det-fao-lex.txt > ../tmp/lexc-all.txt<br />
</pre><br />
<br />
To compile the concatenated file, use the <code>hfst-lexc</code> program:<br />
<br />
<pre><br />
$ hfst-lexc < ../tmp/lexc-all.txt > ../bin/lexc-fao.bin<br />
</pre><br />
<br />
To compile the <code>twol</code> rules, use the <code>hfst-twolc</code> program:<br />
<br />
<pre><br />
$ hfst-twolc twol-fao.txt > twol-fao.bin<br />
</pre><br />
<br />
And then to compose the lexicon and rule file, use <code>hfst-compose-intersect</code>:<br />
<br />
<pre><br />
$ hfst-compose-intersect -l lexc-fao.bin twol-fao.bin -o fao-gen.hfst<br />
</pre><br />
<br />
This creates a generator. If you want an analyser, invert the generator with <code>hfst-invert</code>:<br />
<br />
<pre><br />
$ hfst-invert fao-gen.hfst -o fao-morph.hfst<br />
</pre><br />
<br />
==HFST2 vs HFST3==<br />
Several things changed between HFST2 and HFST3. Notably:<br />
<br />
* In twol files, a <code>/</code> in alphabetic symbols has to be escaped, e.g. <code>%+Der%/st</code> instead of <code>%+Der/st</code>.<br />
* In twol files, you can no longer have Sets on the left-hand side of a rule, so write <code>Vx:Vy /<= _ ; where Vx in Set1 Vy in Set2 ;</code> where you would previously have written <code>Set1:Set2 /<= _ ;</code><br />
<br />
* The old <code>-r</code> option to hfst-twolc is now uppercase: <code>-R</code> <br />
* hfst-lookup-optimize is gone; use <code>hfst-fst2fst -O -i infile.hfst -o outfile.hfst.ol</code> instead<br />
* hfst-lexc needs the output-file option to come before the lexc input file, e.g. <code>hfst-lexc -o outfile.hfst mylexicon.lexc</code><br />
* hfst-compose-intersect uses <code>-1</code> (number one) instead of <code>-l</code> (letter L), and <code>-2</code> for the rule-file. E.g. <code>hfst-compose-intersect -1 lexicon.hfst -2 rules.twol.hfst -o generator.hfst</code><br />
<br />
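Putting the HFST3 options above together, a build could look roughly like the following. The sketch only prints the commands instead of running them. The function name and the intermediate file names are invented for illustration, and the <code>hfst-twolc</code> and <code>hfst-invert</code> calls are copied from the Using section, so they may need adjusting for your HFST3 revision.<br />
<br />
```shell
# Print (do not execute) an HFST3-style build sequence.
# hfst3_build_cmds and the *.hfst file names are hypothetical; the options
# are the ones listed in the bullets above.
hfst3_build_cmds() {
    lexc=$1   # concatenated lexc source
    twol=$2   # twol rule source
    name=$3   # prefix for the generated files
    printf 'hfst-lexc -o %s.lexc.hfst %s\n' "$name" "$lexc"
    # Assumption: same invocation as in the Using section.
    printf 'hfst-twolc %s > %s.twol.hfst\n' "$twol" "$name"
    printf 'hfst-compose-intersect -1 %s.lexc.hfst -2 %s.twol.hfst -o %s.gen.hfst\n' "$name" "$name" "$name"
    # Assumption: same invocation as in the Using section.
    printf 'hfst-invert %s.gen.hfst -o %s.morph.hfst\n' "$name" "$name"
    printf 'hfst-fst2fst -O -i %s.morph.hfst -o %s.morph.hfst.ol\n' "$name" "$name"
}

hfst3_build_cmds lexc-all.txt twol-fao.txt fao
```
<br />
Review the printed commands before running them by hand.<br />
<br />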
==See also==<br />
<br />
* [[Starting a new language with HFST]]<br />
<br />
==External links==<br />
<br />
* [http://www.ling.helsinki.fi/kieliteknologia/tutkimus/hfst/ Helsinki Finite-State Transducer Technology (HFST)]<br />
<br />
[[Category:Morphological analysers]]<br />
[[Category:HFST]]</div>78.170.98.79