Hfst documentation
Latest revision as of 09:26, 15 September 2022
Hfst is a compiler for finite state transducers. The best documentation for writing such transducers is still Beesley and Karttunen (2003), Finite State Morphology. There are, however, some important differences between Hfst and the compilers described in that book.
The Hfst subprograms
Hfst consists of a large number of smaller programs, with different functions:
- hfst-calculate
  - Compiles SFST files into HFST transducers
- hfst-compare
  - Compares two transducers, checking for equivalence
- hfst-compose
  - Composes two transducers
- hfst-compose-intersect
  - Performs intersecting composition on two transducers (typically the morphotactic transducer/lexicon and the morphophonological transducer)
- hfst-concatenate
  - Concatenates two transducers
- hfst-conjunct
  - Conjuncts (intersects) two transducers
- hfst-determinize
  - Determinizes a transducer, i.e. creates an equivalent, epsilon-free transducer that has no state with two or more transitions that have the same input and output symbols
- hfst-diff-test
- hfst-disjunct
  - Disjuncts two transducers, i.e. forms their union
- hfst-duplicate
- hfst-foma-wrapper.sh
- hfst-format
  - Determines the HFST transducer format and prints it to output
- hfst-fst2fst
  - Converts between Hfst, OpenFst, SFST and Foma transducers
- hfst-fst2pairstrings
- hfst-fst2strings
  - Displays the string pairs recognized by a transducer, i.e. paths that lead from the initial state to a final state
- hfst-fst2txt
  - Prints transducers in AT&T tabular format
- hfst-head
  - Gets the first N transducers from an archive
- hfst-invert
  - Inverts a transducer, i.e. swaps its input and output levels
- hfst-lexc
  - Compiles a lexc file into a finite-state transducer
- hfst-lexc2fst
  - Legacy support for the HFST 2 lexc parser; use only if you really need the command-line version of the HFST-based lexc parser
- hfst-lookup
  - Looks up wordforms and gives their lemma+analysis
- hfst-lookup-optimize
- hfst-pmatch2fst
  - Compiles a pmatch script into an FST (see "Beyond Morphology: Pattern Matching with FST" by Lauri Karttunen for an explanation of pmatch)
- hfst-minimize
  - Minimizes a transducer, i.e. creates an equivalent, epsilon-free, deterministic transducer that has as few states as possible
- hfst-name
  - Names a transducer or prints its name
- hfst-omor-evaluate
- hfst-pair-test
  - Tests a twol rule file using correspondences of strings
- hfst-preprocess-for-optimized-lookup-format
- hfst-proc
  - Performs morphological analysis and generation with finite state transducers. This program is intended to clone the functionality of lt-proc from Apertium's lttoolbox and cg-proc from vislcg3.
- hfst-project
  - Projects a transducer towards the input or output level
- hfst-push-weights
  - Pushes the weights of a transducer towards the initial state or the final states
- hfst-regexp2fst
- hfst-remove-epsilons
- hfst-repeat
- hfst-reverse
- hfst-split
- hfst-strings2fst
  - Compiles string-pairs and pair-strings into transducers
- hfst-strip-header
- hfst-substitute
- hfst-subtract
- hfst-summarise
  - Calculates the properties of a transducer
- hfst-symbols
- hfst-tail
- hfst-twolc
  - Compiles a twol (two-level morphophonology) file into an Hfst transducer
- hfst-twolc-loc
- hfst-txt2fst
  - Converts AT&T tabular format into binary transducers
- hfst-xfst
  - Compiles xfst files into Hfst transducers
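Several of the programs above correspond to algebraic operations on transducers. As a rough illustration of what those operations mean, the following Python sketch models a transducer as the finite set of (input, output) string pairs it accepts. This is only a toy model and an assumption made for illustration: it does not use the Hfst library or its file formats, real transducers are state machines that may accept infinitely many pairs, and the function names and example strings are hypothetical.

```python
from typing import FrozenSet, Tuple

# Toy model (an assumption for illustration, not the Hfst API):
# a "transducer" is the finite set of (input, output) pairs it accepts.
Relation = FrozenSet[Tuple[str, str]]


def invert(t: Relation) -> Relation:
    """Swap the input and output levels (compare hfst-invert)."""
    return frozenset((o, i) for i, o in t)


def project(t: Relation, level: str) -> Relation:
    """Keep one level only, as an identity relation (compare hfst-project)."""
    if level == "input":
        return frozenset((i, i) for i, _ in t)
    if level == "output":
        return frozenset((o, o) for _, o in t)
    raise ValueError("level must be 'input' or 'output'")


def compose(a: Relation, b: Relation) -> Relation:
    """Feed the output of a into the input of b (compare hfst-compose)."""
    return frozenset((i, o2) for i, o1 in a for i2, o2 in b if o1 == i2)


def concatenate(a: Relation, b: Relation) -> Relation:
    """Join every pair of a with every pair of b (compare hfst-concatenate)."""
    return frozenset((i1 + i2, o1 + o2) for i1, o1 in a for i2, o2 in b)


def conjunct(a: Relation, b: Relation) -> Relation:
    """Pairs accepted by both (compare hfst-conjunct)."""
    return a & b


def disjunct(a: Relation, b: Relation) -> Relation:
    """Pairs accepted by either (compare hfst-disjunct)."""
    return a | b


def subtract(a: Relation, b: Relation) -> Relation:
    """Pairs accepted by a but not by b (compare hfst-subtract)."""
    return a - b


# Hypothetical miniature lexicon (analysis -> intermediate form)
# and rule relation (intermediate form -> surface form).
lexicon = frozenset({("cat+N+Pl", "cat^s"), ("fox+N+Pl", "fox^s")})
rules = frozenset({("cat^s", "cats"), ("fox^s", "foxes")})

generator = compose(lexicon, rules)  # analysis -> surface form
analyser = invert(generator)         # surface form -> analysis


def lookup(t: Relation, word: str) -> set:
    """Return all outputs paired with the given input string."""
    return {o for i, o in t if i == word}
```

In this toy setting, `lookup(analyser, "foxes")` returns the analyses paired with the surface form, which mirrors the usual division of labour between a lexicon, a rule component and a lookup tool. Minimization and determinization have no counterpart here, because they change the state machine and not the set of accepted pairs.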