Northern Sámi and Finnish/Completed tasks

Revision as of 14:07, 19 June 2010 by Francis Tyers
  • Adding subcategories (Dem, Itg, etc.) to pronouns in Omorfi
  • Fred Karlsson's constraint grammar for Finnish has been GPL'd, and is available and undergoing conversion to CG3 here: https://victorio.uit.no/langtech/trunk/kt/fin/src
    • The conversion should be Apertium-compatible from the start: no reserved symbols (e.g. <, > and /).
  • How can we restrict generation of alternative forms in the Sámi generator? In lttoolbox this is done with LR (analyse only) and RL (generate only) markings.
    • As follows: forms that should only be analysed (the counterpart of LR in lttoolbox) are marked as such in the source code with the tag +Use/NG. All forms given this tag are included in the analyser sme.fst but excluded from the generator isme.fst.
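The effect of the +Use/NG tag can be illustrated with a minimal Python sketch. It is not how sme.fst and isme.fst are actually built (that happens at the transducer level); the function name and the (analysis, surface form) pairs below are hypothetical placeholders, and only the tag +Use/NG comes from the task above.

```python
# Tag that marks a form as "analyse only" (from the task description above).
NG_TAG = "+Use/NG"


def split_for_generation(pairs):
    """Return (analyser_pairs, generator_pairs).

    The analyser keeps every (analysis, surface) pair; the generator
    drops every pair whose analysis carries NG_TAG.
    """
    analyser = list(pairs)
    generator = [(a, s) for a, s in pairs if NG_TAG not in a]
    return analyser, generator


# Hypothetical placeholder entries, not real lexicon data.
pairs = [
    ("lemma+N+Sg+Nom", "form_a"),
    ("lemma+N+Sg+Nom+Use/NG", "form_b"),  # alternative form: analyse only
]
analyser, generator = split_for_generation(pairs)
print(len(analyser), len(generator))
```

With this split, both forms are still recognised on analysis, but only the untagged form can come out of the generator.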


  • A way to use hfst-lookup or something similar to _generate_ from analyses that come in wrapped in ^ and $
  • Can we rig up SVN to pull in the twol file from the GT SVN directly?
  • Some tags do not get replaced by the relabel script: olleet olla+V[GEN=ACT]+Pcp1+Pos+Pl+Nom
    • This should be taken care of. Further problems might be due to a missing Multichar_symbol in the omorfi.hlexc file. -- Francis Tyers
  • Sub-categorise conjunctions into CC/CS?
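A quick way to spot analyses that the relabel script left partly untouched is sketched below in Python. It assumes that an unreplaced tag is anything still enclosed in square brackets, as in the olleet example above; the function name is hypothetical and is not part of the relabel script.

```python
import re

# Assumption: a tag the relabel script missed is still in [KEY=VALUE] form.
LEFTOVER_TAG = re.compile(r"\[[^\]]*\]")


def find_unrelabelled(analysis):
    """Return the bracketed tags that remain in an analysis string."""
    return LEFTOVER_TAG.findall(analysis)


print(find_unrelabelled("olla+V[GEN=ACT]+Pcp1+Pos+Pl+Nom"))
```

Running this over the analyser output gives a list of tags to check against the multichar symbol declarations.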


  • Generation with correct case. At the moment the North Sámi generator cannot generate words with initial caps.
  • Syntax tags should not use > and <. Until these are replaced, the translator should not run the syntax section of the CG (section #5). See the modes.xml file.
    • Replaced. To use with the rest of the GT toolchain, use sed 's/→/>/g' | sed 's/←/</g' -- Francis Tyers
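The same arrow replacement as the sed pipeline above can be sketched in Python, which may be handy inside a script. The function name and the sample tags are illustrative assumptions; only the two character mappings come from the sed commands.

```python
# Same mapping as: sed 's/→/>/g' | sed 's/←/</g'
ARROWS_TO_GT = str.maketrans({"→": ">", "←": "<"})


def to_gt_tags(text):
    """Replace the arrow characters with the > and < used by the GT toolchain."""
    return text.translate(ARROWS_TO_GT)


# Illustrative input; the tag names are placeholders.
print(to_gt_tags("@SUBJ→ @←OBJ"))
```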