Talk:Hfst
Revision as of 08:59, 3 December 2009 by Francis Tyers
What is it?
<jacobEo> why use that?
<spectie> because it has a really expressive formalism for languages with complex morphology, like Finnish, Sami and Basque
<jacobEo> could you give an example of the most important thing it can do that lttoolbox cant?
<spectie> stem internal variation
<jacobEo> and how does it do that?
<spectie> by composing different transducers
<spectie> jacobEo, e.g. you have your lexical transducer, then you have your phonological transducer and you compose the two
<spectie> jacobEo, it's like the postgeneration in apertium, but much more integrated
<spectie> it's like taking care of "live^ed" --> "lived" and "jump^ed" --> "jumped"
<spectie> instead of having two paradigms for "live" and "jump" you would have one paradigm
<spectie> +ed
<spectie> then you would have a phonological rule that says "at morpheme boundaries , collapse e^e -> e
<jacobEo> k
<Unhammer> OTOH, if your verb paradigm looks like this:
<Unhammer> ROOT <p2><sg><pri>
<Unhammer> ROOT+t <p2><pl><pri>
<Unhammer> v+ROOT <p1><sg><pri>
<Unhammer> v+ROOT+t <p1><pl><pri>
<Unhammer> da+v+ROOT <p1><sg><fut>
<Unhammer> da+v+ROOT+t <p1><pl><fut>
<Unhammer> you might want to consider hfst ;)
<spectie> yeah :D
<jacobEo> so its much slower, I suppose
<jimregan> nah
<jimregan> slower to compile, sure
<jimregan> you provide definitions for things like what is a vowel, what is a consonant, and where umlauting happens and what it is
<jimregan> ...in nightmarish syntax that escaped from the 70s
<spectie> http://paste2.org/p/532099

LowerG2 = [ [
(Cns:0) LCnsPhon7 (Cns:0) LCnsPhon (Cns:0) ! xy that cannot be G3, since x cannot form xy G3.
    |                                      ! nijbe,
[ ! This section is for 3-cons G2. and for 2cns G2 that share the initial cns with 2cns G3
(Cns:0) [:j|:l|:m|:n|:v] (Cns:0) :s :t ! S9, 3cns-G2 bäjstov, etc.
    |
(Cns:0) [ :l | :r | :n | :j ] (Cns:0) :s :k ! S9, 3cns-G2 sválskes, etc.
    |
(Cns:0) [ ! 2cns G2 that share the initial cns wit 2cns G3
    :b (Cns:0) [ :d | :m | :j | :l | :n | :n :j | :r | :s | :t :j | :t :s ] ! S9, initial b
    |                                      ! gábdev
    :d (Cns:0) [ :j | :n | :n :j ] ! S7, initial d
    |                                      ! iednev
    :g (Cns:0) :ŋ ! S7, initial g
    |                                      ! låg0ŋot
    :k (Cns:0) [ :n | :k ] ! S7, initial k
    |
    :g (Cns:0) :n !däggna:degna
    ] !
    |
    :r (Cns:0) :s :j :t ! S9, rsjt, bårsjtav
    ]
] - [
[ d t [s|j] ] | b b | d d | g g | k [ s | t | t j | t s ] |
f ':0 f | l ':0 l | m ':0 m | n ':0 n | n ':0 n j | ŋ ':0 ŋ | r ':0 r | s ':0 s | s ':0 s j | v ':0 v ] ];

<spectie> the formalism is human-hostile
<Unhammer> ^^^ and sed-hostile
<spectie> but really awesome... in the modern and biblical senses of the word :)
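
The "one paradigm plus a phonological rule" idea from the log (the "live^ed" --> "lived" case) can be roughly sketched in plain Python. This is only an illustration, not HFST and not its API: the names STEMS, PARADIGM, lexical, phonological and generate are made up here, and a regular-expression rewrite stands in for real transducer composition.

```python
import re

# Hypothetical toy data: one shared paradigm for every regular verb.
STEMS = ["live", "jump"]
PARADIGM = {"<past>": "^ed", "<inf>": ""}  # "^" marks a morpheme boundary


def lexical(stem, tag):
    """Lexical level: stem plus suffix, with the boundary still visible."""
    return stem + PARADIGM[tag]


def phonological(form):
    """Phonological level: collapse e^e -> e, then drop any remaining boundary."""
    form = re.sub(r"e\^e", "e", form)
    return form.replace("^", "")


def generate(stem, tag):
    """Stand-in for composition: feed the lexical output to the rule."""
    return phonological(lexical(stem, tag))


if __name__ == "__main__":
    for stem in STEMS:
        print(stem, "->", lexical(stem, "<past>"), "->", generate(stem, "<past>"))
```

Both verbs share the single "+ed" entry, and the rule only fires for "live", where the boundary sits between two "e"s. A real system would compile both levels into transducers and compose them once, instead of rewriting strings at lookup time.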