
Ideas for Google Summer of Code/Robust tokenisation



Task

  • Update lttoolbox to be fully Unicode compliant with regard to alphabetic symbols.
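
To illustrate what Unicode-aware handling of alphabetic symbols could look like, here is a minimal Python sketch. It is not lttoolbox code and makes no claim about how lttoolbox tokenises; the function names `is_alphabetic` and `tokenise` are hypothetical, and the choice to treat Unicode general categories L* (letters) and M* (combining marks) as word characters is an assumption made for the example.

```python
import unicodedata


def is_alphabetic(ch):
    """Return True for letters (L*) and combining marks (M*).

    Assumption for this sketch: these two general-category groups
    are what counts as an "alphabetic symbol".
    """
    return unicodedata.category(ch)[0] in ("L", "M")


def tokenise(text):
    """Split text into runs of alphabetic characters and single
    non-alphabetic, non-space characters."""
    tokens = []
    current = []
    for ch in text:
        if is_alphabetic(ch):
            current.append(ch)
            continue
        # A non-alphabetic character ends the current word, if any.
        if current:
            tokens.append("".join(current))
            current = []
        # Whitespace only separates tokens; punctuation is kept.
        if not ch.isspace():
            tokens.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens
```

Because the classification comes from the Unicode character database rather than a hand-written alphabet, a word such as "қазақ" or one containing a combining accent stays in one piece without any per-language configuration.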


Coding challenge

  • Remove all multiwords from an Apertium language pair and put them in an apertium-separable dictionary.
  • Make sure that the translation output is identical before and after the change.
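
The before/after check can be sketched as a line-by-line comparison of two saved outputs. This is only an illustration: the helper name `first_difference` is hypothetical, and how the two outputs are produced (for example, by translating the same test corpus before and after moving the multiwords) is left to the reader.

```python
from itertools import zip_longest


def first_difference(before, after):
    """Return (line_number, before_line, after_line) for the first
    mismatch, or None if the two sequences of lines are identical.

    A missing line on either side is reported as None.
    """
    pairs = zip_longest(before, after)
    for lineno, (b, a) in enumerate(pairs, start=1):
        if b != a:
            return lineno, b, a
    return None
```

With the outputs saved to two files, the lines can be read with `open(path, encoding="utf-8").read().splitlines()` and passed to the function; a result of `None` means the outputs match.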

Further reading
