General to-do list
Things that may or may not get done, depending on whether we find the time. Includes items from the feature request tracker on SourceForge.
- Unicode!! (lttoolbox and apertium already converted, except for the part-of-speech tagger training tools)
- .po -> .tmx converter (please see po2tmx)
- A module that applies .tmx to a stream (turn the TMX into a .dix and compile it?); see tmx2dix
- A web service using SOAP or XML-RPC for Apertium translations (from friedel).
- De-/re-formatter for LaTeX (but see traducíndote: Óscar seems to have code available, though in PHP, so we would still have to write an Apertium de-/re-formatter).
- De-/re-formatter for wiki code (as in Wikipedia formatting); formatters can be specified in XML.
- Would be nice to have Python bindings for
- Translation quality evaluation, with humans rather than metrics. See the apertium-eval-translator package, and an alternative module in the project's SVN. Both compute the WER (word error rate) by comparing a raw Apertium translation with its post-edited version.
- Support for languages with prefix or infix inflection (currently only suffix inflection is supported).
- Functions in lttoolbox and apertium to allow for the translation of strings, rather than file streams. This would make it easier to incorporate apertium into other software (e.g. gtranslator or something). [The bilingual dictionary is called that way by the structural transfer, so this may be already available]
- Make an Apertium plugin for OpenOffice.org.
- Make an Apertium plugin for Firefox.
- Make an interface to add new words in monodix, bidix, or new paradigms.
- Write regular expressions for URLs like http://www.dlsi.ua.es or www.pujol.com so that they do not get translated (or write deformatter code for them); the code for these could be used in any language.
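For the translation quality evaluation item, the word error rate can be sketched as a word-level edit distance divided by the length of the post-edited text. This is only an illustration of the general measure, not the code of apertium-eval-translator or of the SVN module; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for the example.

```python
def word_error_rate(raw_translation, postedited):
    """Word-level edit distance divided by the post-edited length.

    Assumption: tokens are whitespace-separated; a real evaluation
    tool may tokenise differently.
    """
    hyp = raw_translation.split()
    ref = postedited.split()
    if not ref:
        raise ValueError("post-edited text must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from raw output
                            curr[j - 1] + 1,      # extra word in raw output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on mat", "the cat sat on the mat"))
```

Dividing by the length of the post-edited text is one common convention; other normalisations are possible.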
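For the URL item, a rough sketch of what such a regular expression might look like is given below. The pattern `URL_RE` and the helper `split_urls` are hypothetical names for this example; the pattern is deliberately simple (it only recognises `http://`, `https://` and bare `www.` prefixes) and would need refining before being used in a real deformatter.

```python
import re

# Assumed pattern: a URL starts with http://, https:// or www. and runs
# until whitespace, angle brackets or a double quote; trailing sentence
# punctuation is left out of the match.
URL_RE = re.compile(r'\b(?:https?://|www\.)[^\s<>"]+[^\s<>".,;:!?)]')


def split_urls(text):
    """Split text into (segment, is_url) pairs, in order.

    Segments flagged True are the ones a deformatter would protect
    from translation.
    """
    parts = []
    pos = 0
    for match in URL_RE.finditer(text):
        if match.start() > pos:
            parts.append((text[pos:match.start()], False))
        parts.append((match.group(0), True))
        pos = match.end()
    if pos < len(text):
        parts.append((text[pos:], False))
    return parts


if __name__ == "__main__":
    for segment, is_url in split_urls("See http://www.dlsi.ua.es or www.pujol.com."):
        print(repr(segment), is_url)
```

Because the pattern does not depend on any particular language, the same expression could in principle be shared across language pairs.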