Ideas for Google Summer of Code/Closer integration with HFST
Latest revision as of 23:58, 5 April 2013
This is a set of subtasks intended to make it easier for Apertium developers to use the Helsinki Finite-State Toolkit (HFST). HFST is a capable toolkit for working with morphological transducers, but it is difficult to install and not well integrated with Apertium: it does not really follow the Apertium way of doing things. We would like to integrate it more closely.
Tasks
- Create a new XML-based format for lexc inspired by lttoolbox (see Development ideas for dictionary format)
- Add a compiler for this format, with support for direction restrictions.
- Fix this bug (https://sourceforge.net/p/hfst/bugs/153/) in hfst-proc tokenisation.
- Modify the HFST build process to make a "minimal" Apertium-centred install.
- Add lttoolbox as a backend to HFST.
- Make hfst-expand obey flag diacritics.
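
To illustrate what "obeying flag diacritics" could mean for the last task, here is a minimal Python sketch. It is not taken from hfst-expand; it assumes the usual Xerox-style flag symbols (@P.F.V@, @N.F.V@, @R.F.V@, @D.F.V@, @C.F@, @U.F.V@) and their commonly documented semantics. The names apply_flag and expand_path are hypothetical. Given the symbols along one path through a transducer, it either returns the string with the flags removed, or None if the flags rule the path out.

```python
import re

# Xerox-style flag diacritic symbol: @OP.FEATURE@ or @OP.FEATURE.VALUE@
FLAG_RE = re.compile(r"^@([PNRDCU])\.([^.@]+)(?:\.([^.@]+))?@$")


def apply_flag(state, op, feature, value):
    """Apply one flag to `state`; return False if the flag blocks the path.

    `state` maps a feature name to a (value, negated) pair.
    A feature that is absent from `state` is unset.
    """
    current = state.get(feature)
    if op == "P":  # positive set: feature = value
        state[feature] = (value, False)
        return True
    if op == "N":  # negative set: feature = "anything but value"
        state[feature] = (value, True)
        return True
    if op == "C":  # clear: unset the feature
        state.pop(feature, None)
        return True
    if op == "R":  # require: feature must be set (to value, if one is given)
        if value is None:
            return current is not None
        return current == (value, False)
    if op == "D":  # disallow: feature must not be set (to value, if given)
        if value is None:
            return current is None
        return current != (value, False)
    if op == "U":  # unify: succeed if compatible, then set feature = value
        compatible = (
            current is None
            or current == (value, False)
            or (current[1] and current[0] != value)
        )
        if compatible:
            state[feature] = (value, False)
        return compatible
    raise ValueError("unknown flag operator: " + op)


def expand_path(symbols):
    """Return the string for one path, or None if its flags are violated."""
    state = {}
    output = []
    for symbol in symbols:
        match = FLAG_RE.match(symbol)
        if match is None:
            output.append(symbol)  # ordinary symbol: keep it
        elif not apply_flag(state, *match.groups()):
            return None  # a flag failed: discard the whole path
    return "".join(output)
```

Under these assumptions, expand_path(["@P.NUM.PL@", "c", "a", "t", "s", "@R.NUM.PL@"]) returns "cats", while the same path starting with @P.NUM.SG@ returns None, so an expansion that obeys flags would not print it.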
Coding challenge
- Install Apertium and HFST.
- Install a language pair which uses both Apertium and HFST.
Frequently asked questions
- none yet, ask us something! :)