I'm writing a toy translator for English->Finnish. I decided to implement all of the phonological changes in lextools, which basically makes FSTs out of rewrite rules. One of those phonological changes is vowel harmony. My translator outputs words with a '~' before each ending, and the output then goes to the FST grammar one word at a time. For example, -ssa/-ssä means 'in', so I represent it as ~ssA, where A stands for a or ä depending on the harmony of the stem. So if I give my translator "in the house", it outputs
- 'talo~ssA#N' (talo means house, 'the' disappears)
Here # indicates the word boundary, and N shows that the word should go to the noun FST (the verb FST takes care of conjugating verbs correctly, given the endings). The final output is then talossa, which is correct.
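To make the harmony step concrete, here is a rough Python sketch of the logic only. It is not the lextools grammar and not how the FST is built. It assumes the `stem~ending#TAG` format shown above; the function name `apply_vowel_harmony` and the extra archiphonemes O and U are my own assumptions for illustration (only A appears in the example above). The rule it uses is the textbook one: back vowels (a, o, u) in the stem select back endings, otherwise front endings are used.

```python
# Hypothetical sketch: resolve archiphonemes in endings by stem harmony.
BACK_VOWELS = set("aou")
FRONT_VOWELS = set("äöy")

# Assumed archiphoneme tables; only "A" is attested in the discussion.
BACK = {"A": "a", "O": "o", "U": "u"}
FRONT = {"A": "ä", "O": "ö", "U": "y"}


def apply_vowel_harmony(form: str) -> str:
    """Turn e.g. 'talo~ssA#N' into 'talossa'."""
    # Drop the '#TAG' part; it only selects which FST handles the word.
    body, _, _tag = form.partition("#")
    # The first piece is the stem, the rest are endings marked with '~'.
    stem, *endings = body.split("~")

    # Stems with only neutral vowels (e, i) take front endings.
    harmony = FRONT
    # The last harmonic vowel in the stem decides the harmony class.
    for ch in reversed(stem.lower()):
        if ch in BACK_VOWELS:
            harmony = BACK
            break
        if ch in FRONT_VOWELS:
            harmony = FRONT
            break

    suffix = "".join(endings)
    # Replace archiphonemes in the endings; other characters pass through.
    return stem + "".join(harmony.get(ch, ch) for ch in suffix)


if __name__ == "__main__":
    print(apply_vowel_harmony("talo~ssA#N"))
```

This ignores compounds, loanwords, and any other phonological changes (consonant gradation and so on), which is exactly where a proper rewrite-rule grammar compiled to an FST earns its keep.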
Realistically, I don't think it's practical/desirable to write phonological grammars in XML, so it would be good to develop a rewrite rule grammar -> FST conversion program which would be tightly integrated with the rest of Apertium. Using other packages works, but it's a pain because of text encoding issues and so on. —Preceding unsigned comment added by Morleye (talk • contribs)
- Hi, wow, great work. For Finnish and other languages with complex morpho-phonology, I would myself recommend something along the lines of foma or hfst. I am working on making both of these tightly integrated with Apertium (as was done, e.g., with constraint grammar). If you want to continue with lttoolbox, we can try to come up with something; stop by on IRC and we can talk about it. - Francis Tyers 08:09, 23 November 2009 (UTC)