Html-tools/GCI suggests and stuff
This page collects ideas for GCI (GSoC?) and similar tasks: functionality for html-tools along the lines of "Suggest a better translation".
- Unknown words
When you translate text with http://apertium.org/, some translations contain words marked with stars, like:
These are words and these nonwords foo. -> Estos son palabras y estos *nonwords *foo.
These words are missing from somewhere in Apertium. Currently the web page kind of collects some of these. I would like them to become links to a page with a form where you can enter details about the word.
Things to do:
  - The most basic implementation can be super simple: a few text boxes, and the form is recorded or mailed to wherever.
  - It is possible to find out where the word is missing from, e.g. whether the word is missing from the English (monolingual) dictionary or from the English-Spanish (bilingual) dictionary. Finding out which is which may require working with APY.
  - On http://apertium.org there are only *words, I think, but it is possible to get @words and #words too (try http://turkic.apertium.org). I personally would like these to be different clickable links to other forms (see this doc for what the symbols mean).
  - When a word is missing from a monolingual dictionary, the user may have to classify it before it can be used. For some languages it is enough to say whether it is a noun, verb, adjective or something else, but for others you need to know the *inflectional pattern* or *paradigm*. To help with this classification there are web apps like https://github.com/flammie/paradigm.abumatran.eu that could be integrated into these links (this task requires some research work as well).
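The link idea above can be sketched roughly as follows. This is only a minimal illustration, not html-tools code: the form paths in `FORM_URLS`, the `word` and `pair` query parameters and the function name `link_marked_words` are all made up for the example. It also assumes the usual Apertium reading of the marks (`*` unknown to the analyser, `@` missing from the bilingual dictionary, `#` generation failure), so that each mark gets its own form.

```python
import html
import re
from urllib.parse import urlencode

# Hypothetical form endpoints, one per mark type; none of these exist in html-tools.
FORM_URLS = {
    "*": "/suggest/unknown-word",         # assumed: word unknown to the analyser
    "@": "/suggest/missing-translation",  # assumed: no bilingual dictionary entry
    "#": "/suggest/generation-error",     # assumed: target-side generation failed
}

# A mark at the start of a word, followed by word characters, e.g. "*foo".
# The lookbehind skips marks inside a token such as "user@example.org".
MARKED_WORD = re.compile(r"(?<!\w)([*@#])(\w+)")


def link_marked_words(translation, pair):
    """Return HTML in which every marked word links to the form for its mark."""
    parts = []
    last = 0
    for match in MARKED_WORD.finditer(translation):
        mark, word = match.groups()
        # Escape the plain text between the previous match and this one.
        parts.append(html.escape(translation[last:match.start()]))
        query = urlencode({"word": word, "pair": pair})
        href = html.escape(f"{FORM_URLS[mark]}?{query}", quote=True)
        parts.append(f'<a href="{href}">{html.escape(mark + word)}</a>')
        last = match.end()
    parts.append(html.escape(translation[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(link_marked_words("Estos son palabras y estos *nonwords *foo.", "eng-spa"))
```

In the real web page this would more likely be done in the JavaScript front end; the point here is only that one pattern can pick out all three kinds of marked words and send each kind to a different form.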
- "Suggest a better translation" features are known to be prone to vandalism, so user signup / login / authentication may be needed, also for copyright reasons.
Experiences from an earlier experiment: http://permalink.gmane.org/gmane.comp.nlp.apertium/5478
A larger, previous implementation: http://wiki.apertium.org/wiki/User:Dtr5