Ideas for Google Summer of Code/Closer integration with HFST
		Revision as of 14:45, 14 March 2013 by Francis Tyers (talk | contribs)
This is a set of subtasks to make it easier for Apertium developers to use the Helsinki Finite-State Toolkit (HFST). HFST is a great toolkit for working with morphological transducers, but it is difficult to install, is not well integrated with Apertium, and does not really follow the Apertium way of doing things. We'd like to integrate it more closely.
Tasks
- Create a new XML-based format for lexc inspired by lttoolbox (see Development ideas for dictionary format)
- Add a compiler for this format, with support for direction restrictions.
- Fix this bug in hfst-proc tokenisation.
  - the link says it's fixed, is it? (or is it that we want ^al/*al$ ^žaktare/*žaktare$ instead of ^al žaktare/*al žaktare$?)
  - yes, we want the same behaviour as lttoolbox.
- Modify the HFST build process to make a "minimal" Apertium-centred install.
- Add lttoolbox as a backend to HFST.
- Make hfst-expand obey flag diacritics.
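
To make the tokenisation task concrete, here is a minimal Python sketch of the two outputs discussed there. It is not how hfst-proc or lttoolbox is implemented. The function names are hypothetical, every token is assumed to be unknown to the analyser, and splitting on whitespace is a simplification of real tokenisation.

```python
def mark_unknown_per_token(text: str) -> str:
    """Desired behaviour (as in lttoolbox): each token gets its own ^.../*...$ unit.

    Assumption: tokens are separated by whitespace and none of them is
    known to the analyser, so each one is marked with '*'.
    """
    return " ".join(f"^{token}/*{token}$" for token in text.split())


def mark_unknown_whole(text: str) -> str:
    """Undesired behaviour: the whole span is emitted as one unknown unit."""
    return f"^{text}/*{text}$"


if __name__ == "__main__":
    sample = "al žaktare"
    print(mark_unknown_per_token(sample))  # ^al/*al$ ^žaktare/*žaktare$
    print(mark_unknown_whole(sample))      # ^al žaktare/*al žaktare$
```

The first function gives the output the task asks for; the second reproduces the output it wants to avoid.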

