Difference between revisions of "Hfst documentation"

(kitwiki→gh)
 
Line 6: Line 6:
 
Hfst consists of a large number of smaller programs, with different functions:
 
Hfst consists of a large number of smaller programs, with different functions:
   
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstCalculate hfst-calculate]
+
* [https://github.com/hfst/hfst/wiki/HfstCalculate hfst-calculate]
 
** Compiles SFST files into HFST transducers
 
** Compiles SFST files into HFST transducers
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstCompare hfst-compare]
+
* [https://github.com/hfst/hfst/wiki/HfstCompare hfst-compare]
 
** Compares two transducers, checking for equivalence
 
** Compares two transducers, checking for equivalence
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstCompose hfst-compose]
+
* [https://github.com/hfst/hfst/wiki/HfstCompose hfst-compose]
 
** Composes two transducers
 
** Composes two transducers
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstComposeIntersect hfst-compose-intersect]
+
* [https://github.com/hfst/hfst/wiki/HfstComposeIntersect hfst-compose-intersect]
 
** Perform intersecting composition on two transducers (typically the morphotactic transducer/lexicon and the morphophonological transducer)
 
** Perform intersecting composition on two transducers (typically the morphotactic transducer/lexicon and the morphophonological transducer)
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstConcatenate hfst-concatenate]
+
* [https://github.com/hfst/hfst/wiki/HfstConcatenate hfst-concatenate]
 
** Concatenates two transducers
 
** Concatenates two transducers
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstConjunct hfst-conjunct]
+
* [https://github.com/hfst/hfst/wiki/HfstConjunct hfst-conjunct]
 
** Conjuncts two transducers
 
** Conjuncts two transducers
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstDeterminize hfst-determinize]
+
* [https://github.com/hfst/hfst/wiki/HfstDeterminize hfst-determinize]
 
** Determinize a transducer, i.e. create an equivalent, epsilon-free transducer that has no state with two or more transitions that have the same input and output symbols.
 
** Determinize a transducer, i.e. create an equivalent, epsilon-free transducer that has no state with two or more transitions that have the same input and output symbols.
 
* [ hfst-diff-test]
 
* [ hfst-diff-test]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstConjunct hfst-disjunct]
+
* [https://github.com/hfst/hfst/wiki/HfstConjunct hfst-disjunct]
 
** Disjuncts two transducers
 
** Disjuncts two transducers
 
* [ hfst-duplicate]
 
* [ hfst-duplicate]
Line 28: Line 28:
 
* [ hfst-foma-wrapper.sh]
 
* [ hfst-foma-wrapper.sh]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstFormat hfst-format]
+
* [https://github.com/hfst/hfst/wiki/HfstFormat hfst-format]
 
** Determine HFST transducer format and print it to output.
 
** Determine HFST transducer format and print it to output.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstFst2Fst hfst-fst2fst]
+
* [https://github.com/hfst/hfst/wiki/HfstFst2Fst hfst-fst2fst]
 
** Converts between Hfst, OpenFst, SFST and Foma transducers
 
** Converts between Hfst, OpenFst, SFST and Foma transducers
 
* [ hfst-fst2pairstrings]
 
* [ hfst-fst2pairstrings]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstFst2Strings hfst-fst2strings]
+
* [https://github.com/hfst/hfst/wiki/HfstFst2Strings hfst-fst2strings]
 
** Display the string pairs recognized by a transducer, i.e. paths that lead from the initial state to a final state.
 
** Display the string pairs recognized by a transducer, i.e. paths that lead from the initial state to a final state.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstFst2Txt hfst-fst2txt]
+
* [https://github.com/hfst/hfst/wiki/HfstFst2Txt hfst-fst2txt]
 
** Prints transducers in AT&T tabular format
 
** Prints transducers in AT&T tabular format
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstHead hfst-head]
+
* [https://github.com/hfst/hfst/wiki/HfstHead hfst-head]
 
** Get N first transducers from an archive.
 
** Get N first transducers from an archive.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstInvert hfst-invert]
+
* [https://github.com/hfst/hfst/wiki/HfstInvert hfst-invert]
 
** Turn a transducer upside down.
 
** Turn a transducer upside down.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstLexc '''hfst-lexc''']
+
* [https://github.com/hfst/hfst/wiki/HfstLexc '''hfst-lexc''']
 
** Compile a lexc file into a finite-state transducer
 
** Compile a lexc file into a finite-state transducer
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstLexc2Fst hfst-lexc2fst]
+
* [https://github.com/hfst/hfst/wiki/HfstLexc2Fst hfst-lexc2fst]
 
** Legacy support for HFST 2 lexc parser, '''use only if you really need the command-line version''' of HFST-based lexc parser
 
** Legacy support for HFST 2 lexc parser, '''use only if you really need the command-line version''' of HFST-based lexc parser
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstLookUp '''hfst-lookup''']
+
* [https://github.com/hfst/hfst/wiki/HfstLookUp '''hfst-lookup''']
 
** lookup, gives ''lemma+analysis'' of wordforms
 
** lookup, gives ''lemma+analysis'' of wordforms
 
* [ hfst-lookup-optimize]
 
* [ hfst-lookup-optimize]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstPmatch2Fst hfst-pmatch2fst]
+
* [https://github.com/hfst/hfst/wiki/HfstPmatch2Fst hfst-pmatch2fst]
 
** Compile a pmatch pmscript into an FST (see [https://www.researchgate.net/publication/221504250_Beyond_Morphology_Pattern_Matching_with_FST Beyond Morphology: Pattern Matching with FST] by Lauri Karttunen for what pmatch is)
 
** Compile a pmatch pmscript into an FST (see [https://www.researchgate.net/publication/221504250_Beyond_Morphology_Pattern_Matching_with_FST Beyond Morphology: Pattern Matching with FST] by Lauri Karttunen for what pmatch is)
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstMinimize hfst-minimize]
+
* [https://github.com/hfst/hfst/wiki/HfstMinimize hfst-minimize]
 
** Minimize a transducer, i.e. create an equivalent, epsilon-free, deterministic transducer that has as few states as possible.
 
** Minimize a transducer, i.e. create an equivalent, epsilon-free, deterministic transducer that has as few states as possible.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstName hfst-name]
+
* [https://github.com/hfst/hfst/wiki/HfstName hfst-name]
 
** Name a transducer or print its name.
 
** Name a transducer or print its name.
 
* [ hfst-omor-evaluate]
 
* [ hfst-omor-evaluate]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstPairTest hfst-pair-test]
+
* [https://github.com/hfst/hfst/wiki/HfstPairTest hfst-pair-test]
 
** Test a twol rule file using correspondences of strings.
 
** Test a twol rule file using correspondences of strings.
 
* [ hfst-preprocess-for-optimized-lookup-format]
 
* [ hfst-preprocess-for-optimized-lookup-format]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstProc hfst-proc]
+
* [https://github.com/hfst/hfst/wiki/HfstProc hfst-proc]
 
** A tool for performing morphological analysis and generation with finite state transducers. This program is intended to clone functionality of apertium's lt-toolbox's lt-proc and vislcg3's cg-proc.
 
** A tool for performing morphological analysis and generation with finite state transducers. This program is intended to clone functionality of apertium's lt-toolbox's lt-proc and vislcg3's cg-proc.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstProject hfst-project]
+
* [https://github.com/hfst/hfst/wiki/HfstProject hfst-project]
 
** Project a transducer towards input or output level.
 
** Project a transducer towards input or output level.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstPushWeights hfst-push-weights]
+
* [https://github.com/hfst/hfst/wiki/HfstPushWeights hfst-push-weights]
 
** Push weights of a transducer towards initial state or final states.
 
** Push weights of a transducer towards initial state or final states.
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstRegexp2Fst hfst-regexp2fst]
+
* [https://github.com/hfst/hfst/wiki/HfstRegexp2Fst hfst-regexp2fst]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstRemoveEpsilons hfst-remove-epsilons]
+
* [https://github.com/hfst/hfst/wiki/HfstRemoveEpsilons hfst-remove-epsilons]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstRepeat hfst-repeat]
+
* [https://github.com/hfst/hfst/wiki/HfstRepeat hfst-repeat]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstReverse hfst-reverse]
+
* [https://github.com/hfst/hfst/wiki/HfstReverse hfst-reverse]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstSplit hfst-split]
+
* [https://github.com/hfst/hfst/wiki/HfstSplit hfst-split]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstStrings2Fst hfst-strings2fst]
+
* [https://github.com/hfst/hfst/wiki/HfstStrings2Fst hfst-strings2fst]
 
** Compiles string-pairs and pair-strings into transducers
 
** Compiles string-pairs and pair-strings into transducers
 
* [ hfst-strip-header]
 
* [ hfst-strip-header]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstSubstitute hfst-substitute]
+
* [https://github.com/hfst/hfst/wiki/HfstSubstitute hfst-substitute]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstSubtract hfst-subtract]
+
* [https://github.com/hfst/hfst/wiki/HfstSubtract hfst-subtract]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstSummarize hfst-summarise]
+
* [https://github.com/hfst/hfst/wiki/HfstSummarize hfst-summarise]
 
** Calculates the properties of a transducer
 
** Calculates the properties of a transducer
 
* [ hfst-symbols]
 
* [ hfst-symbols]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstTail hfst-tail]
+
* [https://github.com/hfst/hfst/wiki/HfstTail hfst-tail]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstTwolC '''hfst-twolc''']
+
* [https://github.com/hfst/hfst/wiki/HfstTwolC '''hfst-twolc''']
 
** Compiles a twol (two-level morphophonology) file into an Hfst transducer
 
** Compiles a twol (two-level morphophonology) file into an Hfst transducer
 
* [ hfst-twolc-loc]
 
* [ hfst-twolc-loc]
 
**
 
**
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstTxt2Fst hfst-txt2fst]
+
* [https://github.com/hfst/hfst/wiki/HfstTxt2Fst hfst-txt2fst]
 
** Converts AT&T tabular format into binary transducers
 
** Converts AT&T tabular format into binary transducers
* [https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstXfst '''hfst-xfst''']
+
* [https://github.com/hfst/hfst/wiki/HfstXfst '''hfst-xfst''']
 
** Compiles xfst files into Hfst transducers
 
** Compiles xfst files into Hfst transducers
   

Latest revision as of 09:26, 15 September 2022

Hfst is a compiler for finite-state transducers. The best documentation for writing such transducers is still Beesley and Karttunen (2003): Finite State Morphology. There are, however, some important differences between Hfst and the compilers described in the B&K book.


The Hfst subprograms

Hfst consists of a large number of smaller programs, with different functions:

  • hfst-calculate
    • Compiles SFST files into HFST transducers
  • hfst-compare
    • Compares two transducers, checking for equivalence
  • hfst-compose
    • Composes two transducers
  • hfst-compose-intersect
    • Perform intersecting composition on two transducers (typically the morphotactic transducer/lexicon and the morphophonological transducer)
  • hfst-concatenate
    • Concatenates two transducers
  • hfst-conjunct
    • Conjuncts (intersects) two transducers
  • hfst-determinize
    • Determinize a transducer, i.e. create an equivalent, epsilon-free transducer that has no state with two or more transitions that have the same input and output symbols.
  • hfst-diff-test
  • hfst-disjunct
    • Disjuncts (unions) two transducers
  • hfst-duplicate
  • hfst-foma-wrapper.sh
  • hfst-format
    • Determine HFST transducer format and print it to output.
  • hfst-fst2fst
    • Converts between Hfst, OpenFst, SFST and Foma transducers
  • hfst-fst2pairstrings
  • hfst-fst2strings
    • Display the string pairs recognized by a transducer, i.e. paths that lead from the initial state to a final state.
  • hfst-fst2txt
    • Prints transducers in AT&T tabular format
  • hfst-head
    • Get the first N transducers from an archive.
  • hfst-invert
    • Invert a transducer, i.e. swap its input and output levels.
  • hfst-lexc
    • Compile a lexc file into a finite-state transducer
  • hfst-lexc2fst
    • Legacy support for the HFST 2 lexc parser; use only if you really need the command-line version of the HFST-based lexc parser
  • hfst-lookup
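    • Looks up wordforms and gives their lemma+analysis
  • hfst-lookup-optimize
  • hfst-pmatch2fst
    • Compile a pmatch pmscript into an FST (see Beyond Morphology: Pattern Matching with FST by Lauri Karttunen for what pmatch is)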
  • hfst-minimize
    • Minimize a transducer, i.e. create an equivalent, epsilon-free, deterministic transducer that has as few states as possible.
  • hfst-name
    • Name a transducer or print its name.
  • hfst-omor-evaluate
  • hfst-pair-test
    • Test a twol rule file using correspondences of strings.
  • hfst-preprocess-for-optimized-lookup-format
  • hfst-proc
    • A tool for performing morphological analysis and generation with finite-state transducers. It is intended to replicate the functionality of lt-proc from Apertium's lttoolbox and cg-proc from vislcg3.
  • hfst-project
    • Project a transducer towards input or output level.
  • hfst-push-weights
    • Push weights of a transducer towards initial state or final states.
  • hfst-regexp2fst
  • hfst-remove-epsilons
  • hfst-repeat
  • hfst-reverse
  • hfst-split
  • hfst-strings2fst
    • Compiles string-pairs and pair-strings into transducers
  • hfst-strip-header
  • hfst-substitute
  • hfst-subtract
  • hfst-summarise
    • Calculates the properties of a transducer
  • hfst-symbols
  • hfst-tail
  • hfst-twolc
    • Compiles a twol (two-level morphophonology) file into an Hfst transducer
  • hfst-twolc-loc
  • hfst-txt2fst
    • Converts AT&T tabular format into binary transducers
  • hfst-xfst
    • Compiles xfst files into Hfst transducers
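
Several of the programs listed above are usually chained: a lexc lexicon and a twol rule file are compiled separately, combined by intersecting composition, and the result is inverted to obtain an analyser. The following is a minimal sketch of such a chain, not an official build recipe. The file names and the helper functions `build_pipeline` and `run_pipeline` are hypothetical, and the options `-o`, `-1` and `-2` are assumed to behave as the usual HFST output and first/second input options; check each tool's `--help` before relying on them.

```python
import shutil
import subprocess
from typing import List


def build_pipeline(lexc: str, twol: str, prefix: str) -> List[List[str]]:
    """Return the command lines for a lexc + twol build (hypothetical file names)."""
    lex_fst = f"{prefix}.lexc.hfst"    # morphotactic transducer (lexicon)
    rules_fst = f"{prefix}.twol.hfst"  # morphophonological rule transducers
    gen_fst = f"{prefix}.gen.hfst"     # analysis -> surface form (generator)
    ana_fst = f"{prefix}.ana.hfst"     # surface form -> analysis (analyser)
    return [
        # Compile the lexc file into a transducer.
        ["hfst-lexc", lexc, "-o", lex_fst],
        # Compile the twol rule file.
        ["hfst-twolc", twol, "-o", rules_fst],
        # Intersecting composition of the lexicon with the rules.
        ["hfst-compose-intersect", "-1", lex_fst, "-2", rules_fst, "-o", gen_fst],
        # Swap input and output levels to get an analyser.
        ["hfst-invert", gen_fst, "-o", ana_fst],
    ]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_pipeline() to execute them.
    for cmd in build_pipeline("example.lexc", "example.twol", "example"):
        print(" ".join(cmd))
```

Building the commands separately from running them makes it possible to inspect the chain without having HFST installed. The inverted transducer is the kind of file that hfst-lookup would then be given to analyse wordforms.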

See also