Task ideas for Google Code-in

From Apertium
Revision as of 21:32, 18 October 2010 by Francis Tyers (talk | contribs)

An informal spot for outlining ideas for the Google Code-in (GCI).

  1. Code Take two language pairs that share a language, cross them with apertium-crossdics, and clean up the resulting bilingual dictionary. For instance, build Occitan-French from Occitan-Catalan and Catalan-French.
  2. Code Convert an existing resource into Apertium format, for example an analyser for Punjabi or Hindi.
  3. Documentation Document features that are used in language pairs but are missing from the current official documentation or wiki (for instance, cascaded interchunk transfer), and integrate them into the existing "official" documentation.
  4. Outreach Write a quick guide on 'What Apertium can and cannot do to help you with your homework'.
  5. Quality Assurance Perform a human post-editing evaluation of one of our non-evaluated pairs, on a text of at least 5,000 words.
  6. Quality Assurance Make some concrete improvements in a language pair. This might be disambiguation, transfer or vocabulary (in particular, minor-major language pairs would appreciate input: Welsh-English, Basque-Spanish, Breton-French).
  7. Research Pick an under-resourced language and find as many free resources for it as possible, such as grammatical or morphological descriptions and dictionaries. Catalogue them in the Incubator.
  8. Training Write a simple step-by-step guide (on the wiki) that shows pre-university students, of varying levels of computer literacy, how to install a development version of Apertium and start on development or polishing tasks like the ones above, so that they can become young Apertium developers. The guide may reuse or link to existing material.
  9. Translation Translate the new language pair HOWTO and, in the process, commit the translation system you make to the Incubator.
  10. User interface Update apertium-tolk and apertium-dbus
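
To give a feel for what task 1 involves, here is a minimal sketch in Python of crossing two bilingual word lists through a shared pivot language. It is only an illustration of the idea, not how apertium-crossdics works internally: the function name `cross_dictionaries` and the toy word lists are hypothetical, and real Apertium dictionaries are XML files that also carry part-of-speech and other tags.

```python
from collections import defaultdict


def cross_dictionaries(src_pivot, pivot_tgt):
    """Cross two bilingual word lists through a shared pivot language.

    Both arguments are lists of (word, translation) pairs. The result maps
    each source word to a sorted list of candidate target translations.
    Hypothetical helper for illustration only.
    """
    # Index the second dictionary by its pivot-language word.
    pivot_index = defaultdict(set)
    for pivot, tgt in pivot_tgt:
        pivot_index[pivot].add(tgt)

    # Follow each source word through the pivot to its target candidates.
    crossed = defaultdict(set)
    for src, pivot in src_pivot:
        for tgt in pivot_index.get(pivot, ()):
            crossed[src].add(tgt)

    return {src: sorted(tgts) for src, tgts in crossed.items()}


# Toy data (assumed for the example): Occitan-Catalan and Catalan-French.
oc_ca = [("ostal", "casa"), ("can", "gos"), ("gat", "gat")]
ca_fr = [("casa", "maison"), ("gos", "chien"), ("gat", "chat")]

oc_fr = cross_dictionaries(oc_ca, ca_fr)
for occitan, french in oc_fr.items():
    print(occitan, "->", ", ".join(french))
```

When a pivot word has several translations, the source word ends up with several candidates. Entries like these are the ones that need the manual clean-up mentioned in the task.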