Morphological segmentation

From Apertium

To build a morphological segmenter and get segmented output, run the following commands in the apertium-kaz/.deps directory (or the .deps directory of any other Apertium Turkic language):

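# Invert the lexc transducer (swap its input and output sides)
hfst-invert kaz.LR.lexc.hfst -o kaz.LR.lexc.hfst.inv
# Compose the inverted lexicon with the full kaz.LR.hfst transducer
hfst-compose-intersect -2 kaz.LR.hfst -1 kaz.LR.lexc.hfst.inv -o kaz.seg
# Invert the result and convert it to the optimized-lookup format that hfst-proc reads
hfst-invert kaz.seg | hfst-fst2fst -O -o kaz.segmenter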

$ echo "щеткадағы" | hfst-proc kaz.segmenter 
^щеткадағы/щетка>{D}{A}{G}{I}$

You can then feed this output through the script morph-to-lattice.py. To get Moses-style PLF output, change the 'style' option in the script to 0.
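
If you only want to inspect the segmenter output directly, a minimal Python sketch along the following lines may help. It is not morph-to-lattice.py and does not produce PLF. It assumes the output format shown in the example above: each unit is printed as ^surface/segmentation$, further segmentations (if any) are separated by '/', and '>' marks a boundary inside a segmentation. The name parse_segmentation is hypothetical and is not part of Apertium or HFST.

```python
import re

# One unit as printed by hfst-proc: ^surface/segmentation1/segmentation2$
# (assumed format, based on the example output on this page)
UNIT_RE = re.compile(r"\^([^/$]*)((?:/[^/$]*)*)\$")


def parse_segmentation(line):
    """Return a list of (surface, segmentations) pairs.

    Each segmentation is a list of the pieces found between '>' boundaries.
    """
    units = []
    for match in UNIT_RE.finditer(line):
        surface = match.group(1)
        # group(2) starts with '/', so drop the empty first element
        analyses = [a for a in match.group(2).split("/") if a]
        units.append((surface, [a.split(">") for a in analyses]))
    return units


if __name__ == "__main__":
    sample = "^щеткадағы/щетка>{D}{A}{G}{I}$"
    for surface, segmentations in parse_segmentation(sample):
        print(surface, segmentations)
```

The function can be applied line by line to the text produced by hfst-proc kaz.segmenter. Symbols such as {D} or {A} are kept as they appear in the output and are not interpreted.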