Ideas for Google Summer of Code/Spell checking

Revision as of 13:47, 17 March 2016 by Tino Didriksen (talk | contribs) (→‎Tasks)

Apertium's HFST transducers can be compiled into libraries that libvoikko can use for spell checking, including suggestions. Our lttoolbox-based transducers should be usable in the same way. We also have the beginnings of a spell-checking interface for our website framework.

This project involves finishing this work, so that the input box on apertium.org does server-based spell checking and all Apertium transducers can be used to build spell checkers for other projects.

Tasks

  • make sure that both HFST and lttoolbox transducers can be compiled to zhfst spellers, dynamically generating the error model where none exists
  • create clean Makefile rules for speller compilation that can be used in our monolingual modules
  • determine the best way to integrate spell checking into Apertium, both as a configurable compile-time option and as an Apertium mode (and integrate this into apertium-init)
  • make apertium-apy (APY) support this type of mode
  • finish the spell-checking web interface in apertium-html-tools and make it talk to APY
  • make spell checker modules easy to compile for other platforms that do spell checking (at minimum OpenOffice, which can be convinced to use voikko modules if you ask nicely, but potentially also ispell, aspell or OS X's spelling engine)
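
To make the "dynamically generating the error model" task more concrete, here is a minimal Python sketch of what a default error model does conceptually: propose every string within one edit of the misspelled word and keep those the analyser accepts. This is only an illustration, not the HFST implementation. A real zhfst speller encodes the same idea as a weighted transducer, and the plain Python set `lexicon` here is a hypothetical stand-in for the acceptor.

```python
def edits1(word, alphabet):
    """Return every string one edit away from `word`
    (deletion, transposition, replacement, insertion)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def suggest(word, lexicon):
    """Return sorted suggestions for `word`, or [] if it is already accepted.

    `lexicon` is a hypothetical stand-in for the analyser (the acceptor):
    here it is just a set of known surface forms.
    """
    if word in lexicon:
        return []
    # Derive the alphabet from the lexicon, much as a generated error
    # model would be built from the symbols the analyser knows about.
    alphabet = sorted({ch for entry in lexicon for ch in entry})
    return sorted(edits1(word, alphabet) & set(lexicon))


if __name__ == "__main__":
    words = {"cat", "cart", "car", "dog"}
    print(suggest("caat", words))
    print(suggest("dgo", words))
```

An unweighted model like this treats all edits as equally likely; a hand-written error model can instead rank common typos for a given language or keyboard layout higher.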

Coding challenges

  1. Compile a speller for one of our HFST-based analysers and run it on some text
  2. Compile a speller for one of our lttoolbox-based analysers and run it on some text
  3. Create a libvoikko speller from one of our HFST-based analysers and test it in LibreOffice
  4. Create a libvoikko speller from one of our lttoolbox-based analysers and test it in LibreOffice
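
As a starting point for the first two challenges, the packaging step can be sketched in Python. The sketch assumes that a zhfst file is a zip archive containing an acceptor, an error model and an `index.xml` description, with the member names `acceptor.default.hfst` and `errmodel.default.hfst`. The `index.xml` layout below is written from memory of hfst-ospell conventions and should be checked against its documentation. The function name `build_zhfst` and its parameters are hypothetical, and the two input transducers are assumed to have been compiled into the optimised lookup format already.

```python
import zipfile
from xml.sax.saxutils import escape

# Assumed minimal index.xml layout; a real speller may need more
# metadata fields (version, date, producer, contact).
INDEX_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<hfstspeller dtdversion="1.0" hfstversion="3">
  <info>
    <locale>{locale}</locale>
    <title>{title}</title>
    <description>{title}</description>
  </info>
  <acceptor type="general" id="acceptor.default.hfst">
    <title>{title}</title>
    <description>{title}</description>
  </acceptor>
  <errmodel id="errmodel.default.hfst">
    <title>Default error model</title>
    <description>Default error model</description>
    <type type="default"/>
    <model>errmodel.default.hfst</model>
  </errmodel>
</hfstspeller>
"""


def build_zhfst(acceptor_path, errmodel_path, out_path, locale, title):
    """Bundle an acceptor and an error model into a zhfst-style zip archive."""
    index_xml = INDEX_TEMPLATE.format(locale=escape(locale), title=escape(title))
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write(acceptor_path, "acceptor.default.hfst")
        archive.write(errmodel_path, "errmodel.default.hfst")
        archive.writestr("index.xml", index_xml)
    return out_path
```

A Makefile rule in a monolingual module could call a step like this once both transducers have been built, which would keep the archive layout in one place.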

See also