Difference between revisions of "Apertium Turkic/TODO"

Revision as of 04:52, 13 January 2014

This section outlines what's left to get http://turkic.apertium.com/ up and running.

~~Get apertium-apy working stably~~
merge simple-html and html-tools so that simple-html can be automatically extracted from html-tools
~~apache forwarding for html-tools~~ (unnecessary!)
init scripts and cron testers for apertium-html-tools, gateway, and apertium-apy
- find some way to have it retry restarting if it fails because the port is still reserved by the OS
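One possible shape for the retry idea above, as a minimal Python sketch rather than an actual init script: probe the port with a plain bind() and only launch the service once the bind succeeds. The helper names (`port_is_free`, `start_when_port_free`), the retry count, and the delay are all hypothetical, and the command and port passed in are placeholders, not the real apertium-apy invocation.

```python
import socket
import subprocess
import time


def port_is_free(port, host="127.0.0.1"):
    """Return True if a plain bind() to (host, port) succeeds right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: something else (or the OS) still holds the port.
            return False
    return True


def start_when_port_free(command, port, retries=10, delay=5.0,
                         launch=subprocess.Popen):
    """Try up to `retries` times to launch `command` once `port` is bindable.

    Returns whatever `launch` returns, or None if the port never became free.
    `launch` is a parameter only so the function can be exercised without
    starting a real process (an assumption made for this sketch).
    """
    for _ in range(retries):
        if port_is_free(port):
            return launch(command)
        time.sleep(delay)
    return None
```

There is an unavoidable gap between the probe and the actual launch, so a cron tester would still need to check afterwards that the service really came up.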

make the following pairs available to the site:

pairs: kaz-tat, tur-kir, kaz-kir, tat-bak, kaz-kaa, tuk-tur?, tur-uzb?, kaz-eng?
transducers: kaz, tat, kir, tur, bak, chv, kum, nog, kaa, uzb?, tuk?

~~localised language names in analysis, generation, and spell-check modes~~
~~get a working theme together~~
disable sandbox mode unless an appropriate switch is passed to apertium-html-tools
~~add a note (localised to various languages) along the lines of "Found a mistake? Help us fix it!" with link to Apertium Turkic~~

consider including the web concordancer on the site (and consider what corpora to provide search access to...)

~~How can we count lexc stems effectively? - JNW's bash script can be generalised (and rewritten in python), and it'll come close~~ see The Right Way to count lexc stems

How can we do single-category testvoc now?
How can we make vanilla transducers (without MT-specific "wrong" POSes)?
- The problem is that "! Use/xxx-yyy" lines can't just be grepped out in the vanilla transducer anymore, since those are needed for the xxx-yyy transducers. That is, we're no longer just copying the lexc file, but copying the full transducer (no trimming before compilation), and trimming the transducer directly (based on the bidix) for use in pairs.
How can we count trimmed stems?

@@ Line 34: / Line 34: @@
 == Things that need to be figured out ==
-* [http://www.google-melange.com/gci/task/view/google/gci2013/5872152972623872 How can we count lexc stems effectively?] - JNW's bash script can be generalised (and rewritten in python), and it'll come close
+* <s>[http://www.google-melange.com/gci/task/view/google/gci2013/5872152972623872 How can we count lexc stems effectively?] - JNW's bash script can be generalised (and rewritten in python), and it'll come close</s> see [[The_Right_Way_to_count_lexc_stems|The Right Way to count lexc stems]]
 === Issues introduced by new build process ===