Apertium Turkic/TODO
Revision as of 09:21, 3 January 2014 by Firespeaker
This is a general to-do list for the Apertium Turkic working group.
Website
Get http://turkic.apertium.com/ up and running.
software infrastructure
- get apertium-apy working stably
- merge simple-html and html-tools so that simple-html can be automatically extracted from html-tools
- apache forwarding for html-tools
- init scripts and cron testers for apertium-html-tools, gateway, and apertium-apy
- find some way to have it retry restarting if it fails because the port is still reserved by the OS
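One possible approach to the port problem is to retry the bind with a growing delay rather than giving up on the first failure. The following is a minimal sketch only, not apertium-apy's actual startup code: the function name `bind_with_retry` and its parameters are hypothetical, and it assumes the failure shows up as `EADDRINUSE` when binding a plain TCP socket.

```python
import errno
import socket
import time


def bind_with_retry(host, port, attempts=5, delay=1.0):
    """Bind a listening TCP socket, retrying while the address is in use.

    Hypothetical helper: waits delay, 2*delay, 4*delay, ... between tries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # SO_REUSEADDR allows rebinding while old connections are in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            sock.listen(5)
            return sock
        except OSError as err:
            sock.close()
            # Only "address already in use" is worth retrying.
            if err.errno != errno.EADDRINUSE or attempt == attempts - 1:
                raise
            time.sleep(delay * (2 ** attempt))
```

An init script or cron tester could call something like `bind_with_retry("0.0.0.0", port)` before handing the socket to the server. Setting `SO_REUSEADDR` may already remove most failures that come from lingering `TIME_WAIT` connections, so the retry loop mainly covers the case where an old process has not exited yet.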
optional: spell checker and language detection stuff
- spell checking mode in apertium-apy
- integrate spell checker interface into html-tools
- get language detection interface working (in progress)
- language detection mode in apertium-apy (prototype done)
what to include
make the following pairs available to the site:
- pairs: kaz-tat, tur-kir, kaz-kir, tat-bak, kaz-kaa, tuk-tur?, tur-uzb?
- transducers: kaz, tat, kir, tur, bak, chv, kum, nog, kaa, uzb?, tuk?
prettifying
- localised language names in analysis and generation
- add a note (localised to various languages) along the lines of "Found a mistake? Help us fix it!" with link to Apertium Turkic
Things that need to be figured out
- How can we count lexc stems effectively? JNW's bash script can be generalised (and rewritten in Python), and it'll come close.
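As a rough illustration of what a Python stem counter might look like (this is not a port of JNW's script, whose details are not given here), the sketch below counts unique upper-side strings in the lexicons the caller treats as stem lexicons. The function name `count_stems` and the `is_stem_lexicon` callback are hypothetical, and the parsing is simplified: it assumes one entry per line and does not handle escaped colons or escaped spaces inside stems.

```python
import re

# An unescaped "!" starts a lexc comment; "%!" is a literal exclamation mark.
_COMMENT = re.compile(r"(?<!%)!.*$")


def count_stems(lexc_text, is_stem_lexicon):
    """Count unique (lexicon, upper side) pairs in the chosen lexicons.

    is_stem_lexicon: callable taking a lexicon name and returning True
    if entries in that lexicon should be counted as stems.
    """
    current = None
    stems = set()
    for raw in lexc_text.splitlines():
        line = _COMMENT.sub("", raw).strip()
        if not line:
            continue
        if line.startswith("Multichar_Symbols"):
            current = None  # symbol declarations are not entries
            continue
        if line.startswith("LEXICON"):
            parts = line.split()
            current = parts[1] if len(parts) > 1 else None
            continue
        if current is None or not is_stem_lexicon(current):
            continue
        if not line.endswith(";"):
            continue  # simplification: one complete entry per line
        fields = line[:-1].split()
        if len(fields) < 2:
            continue  # only a continuation class, no stem
        upper = fields[0].split(":", 1)[0]
        if upper:
            stems.add((current, upper))
    return len(stems)
```

For example, `count_stems(open(path).read(), lambda name: name in {"Nouns", "Verbs"})` would count entries in two lexicons; the names `Nouns` and `Verbs` are placeholders, since which lexicons hold stems differs per language module.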
Issues introduced by new build process
- How can we do single-category testvoc now?
- How can we make vanilla transducers (without MT-specific "wrong" POSes)?
- How can we count trimmed stems?