Task ideas for Google Code-in/Add lexical-select rules

From Apertium
Revision as of 14:15, 29 October 2013 by Francis Tyers
  1. Select a language pair that is already set up for lexical selection, ideally one where the source language is a language you know (L₂) and the target language is a language you use every day (L₁).
  2. Install Apertium locally from the Subversion repository, install the language pair, and make sure that it works. Alternatively, get the Apertium VirtualBox image, update it, then check out and compile the language pair.
  3. Using a large enough corpus of the source language (e.g. plain text taken from Wikipedia, newspapers or literature), detect cases of inadequate lexical choice: the translation is grammatical, but the target word chosen for a source word is wrong because the source word is polysemous (has more than one meaning).
  4. Add entries to the bilingual dictionary if needed and write 10 lexical selection rules that select the correct translation in the relevant context.
  5. Compile and test again.
  6. Submit a patch to your mentor (or commit it if you have already gained developer access).
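
One way to narrow down candidates for step 3 is to list the source lemmas that already have more than one translation in the bilingual dictionary and count how often they occur in the corpus. The following is a minimal sketch, not part of Apertium: the dictionary fragment, the Catalan example words and the function names `ambiguous_lemmas` and `rank_by_frequency` are all made up for illustration, and it assumes the usual `<e><p><l>…</l><r>…</r></p></e>` entry layout. It ignores part-of-speech tags and multiword lemmas, and matches surface forms only, so inflected forms are not counted.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical bilingual-dictionary fragment; a real pair's file is much larger.
BIDIX = """<dictionary>
  <section id="main" type="standard">
    <e><p><l>estació<s n="n"/></l><r>season<s n="n"/></r></p></e>
    <e><p><l>estació<s n="n"/></l><r>station<s n="n"/></r></p></e>
    <e><p><l>gat<s n="n"/></l><r>cat<s n="n"/></r></p></e>
  </section>
</dictionary>"""


def ambiguous_lemmas(bidix_xml):
    """Return {source lemma: sorted translations} for lemmas with 2+ translations."""
    root = ET.fromstring(bidix_xml)
    translations = defaultdict(set)
    for pair in root.iter("p"):
        left, right = pair.find("l"), pair.find("r")
        if left is None or right is None:
            continue
        # .text is the text before the first <s/> child, i.e. the lemma.
        source = (left.text or "").strip()
        target = (right.text or "").strip()
        translations[source].add(target)
    return {lemma: sorted(t) for lemma, t in translations.items() if len(t) > 1}


def rank_by_frequency(corpus_text, lemmas):
    """Count exact surface-form matches of each lemma, most frequent first."""
    tokens = Counter(re.findall(r"\w+", corpus_text.lower()))
    counts = [(lemma, tokens[lemma.lower()]) for lemma in lemmas]
    return sorted(counts, key=lambda item: -item[1])


if __name__ == "__main__":
    candidates = ambiguous_lemmas(BIDIX)
    corpus = "L'estació de tren. Una estació de l'any."
    print(candidates)
    print(rank_by_frequency(corpus, candidates))
```

The words at the top of the ranking are the ones worth reading in context first: translate the sentences that contain them and check whether the default translation is the right one.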