Difference between revisions of "Task ideas for Google Code-in"


Revision as of 15:35, 10 October 2013

This is the task ideas page for Google Code-in (http://www.google-melange.com/gci/homepage/google/gci2013). Here you can find ideas for interesting tasks that will improve your knowledge of Apertium and help you get into the world of open-source development.

The Mentors column lists the people you should contact for further information. Each task is estimated to take an experienced developer at most 2 hours. However:

  1. This estimate does not include the time needed to install and set up Apertium.
  2. The estimate assumes an experienced developer; you may spend longer on a task because of the learning curve.

Categories:

  • code: Tasks related to writing or refactoring code
  • documentation: Tasks related to creating/editing documents and helping others learn more
  • research: Tasks related to community management, outreach/marketing, or studying problems and recommending solutions
  • quality: Tasks related to testing and ensuring code is of high quality.
  • interface: Tasks related to user experience research or user interface design and interaction

Task list

{|
! Category !! Title !! Description !! Mentors
|-
| code || Write 10 constraint grammar rules || Write 10 new constraint grammar rules that resolve tagging problems in unseen text, and observe changes in the output, possibly after retraining the part-of-speech tagger. || User:Mlforcada, User:Francis Tyers
|-
| code || Start a constraint grammar rule file || Start a constraint grammar rule file for a language pair that does not have one, with 5 rules that resolve tagging problems in unseen text, and observe changes in the output, possibly after retraining the part-of-speech tagger. || User:Mlforcada, User:Francis Tyers
|-
| code || Localised available languages function in apertium-apy || Make a new function for apertium-apy that takes a language code as input and outputs the list of available pairs, with their names translated into the language specified by the language code. You will probably need to know JavaScript and Python. || User:Firespeaker, User:Unhammer, User:Francis Tyers
|-
| code || Fix the highlighting in simple-html language selection boxes || The language selection box in the simple-html interface should highlight the supported target languages when a user clicks on a source language. At the moment this does not work properly. For this task you will need to know JavaScript. || User:Firespeaker, User:Francis Tyers
|-
| code || Language detection in apertium-apy || Make a new function for apertium-apy that identifies the language of some input text. For this task you will also need to train models for the language identifier. || User:Firespeaker, User:Unhammer, User:Francis Tyers
|-
| code || SSL in apertium-apy || Make apertium-apy optionally use SSL. (If you put simple-html on an SSL domain, new browsers won't let you make plaintext/non-SSL AJAX requests.) || User:Firespeaker, User:Unhammer, User:Francis Tyers
|-
| code || How much of a given sentence pair is explained by Apertium? || Write (in a scripting language of your choice) a command-line program that takes an Apertium language pair, a source-language sentence S, and a target-language sentence T, and outputs the set of pairs of subsegments (s,t) such that s is a subsegment of S, t is a subsegment of T, and t is the Apertium translation of s or vice versa (a subsegment is a sequence of whole words). || User:Mlforcada
|-
| documentation || - || - || -
|-
| documentation || - || - || -
|-
| research || - || - || -
|-
| research and documentation || The most frequent Romance-to-Romance transfer rules || Study the .t1x transfer rule files of Romance language pairs and distill 5-10 rules that are common to all of them, perhaps by rewriting them into some equivalent form. || User:Mlforcada
|-
| quality || - || - || -
|-
| quality || Improve the quality of a language pair by adding entries to it || Improve the quality of a language pair by (a) running a large amount of representative text through it, (b) determining the 30 most frequent unknown words and (c) adding them to the dictionaries so that they are no longer unknown. || User:Mlforcada
|-
| quality || Improve the quality of a language pair by allowing for alternative translations || Improve the quality of a language pair by (a) detecting 5 cases where the (only) translation provided by the bilingual dictionary is not adequate in a given context, (b) adding the lexical selection module to the language pair, and (c) writing effective lexical selection rules that exploit that context to select a better translation. || User:Francis Tyers, User:Mlforcada
|-
| interface || Abstract the formatting for the simple-html interface || The simple-html interface should be easily customisable so that people can make it look how they want. The task is to abstract the formatting and make one or more new stylesheets to change the appearance. || User:Francis Tyers
|}
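
As a starting point for the "How much of a given sentence pair is explained by Apertium?" task, the core logic could be sketched roughly as follows. This is only an illustration, not a reference solution: the function names `subsegments` and `explained_pairs` are made up for this sketch, and the two translator callables are stand-ins. In a real program they would presumably wrap calls to Apertium for the chosen pair in each direction; here a toy dictionary is used instead so the sketch runs without Apertium installed.

```python
from typing import Callable, Iterator, Set, Tuple


def subsegments(sentence: str) -> Iterator[str]:
    """Yield every contiguous sequence of whole words in the sentence."""
    words = sentence.split()
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            yield " ".join(words[start:end])


def explained_pairs(
    source: str,
    target: str,
    translate: Callable[[str], str],
    translate_back: Callable[[str], str],
) -> Set[Tuple[str, str]]:
    """Return pairs (s, t) where t translates s, or s translates t.

    `translate` and `translate_back` are hypothetical stand-ins for
    Apertium run in the source-to-target and target-to-source directions.
    """
    source_segments = set(subsegments(source))
    target_segments = set(subsegments(target))
    pairs = set()
    # Source-to-target direction: is the translation of s a subsegment of T?
    for s in source_segments:
        t = translate(s).strip()
        if t in target_segments:
            pairs.add((s, t))
    # Target-to-source direction ("or vice versa" in the task description).
    for t in target_segments:
        s = translate_back(t).strip()
        if s in source_segments:
            pairs.add((s, t))
    return pairs


# Toy translators standing in for Apertium (assumption for illustration only).
FORWARD = {"the cat": "el gato", "cat": "gato", "sleeps": "duerme"}
BACKWARD = {"el gato": "the cat", "duerme": "sleeps"}

result = explained_pairs(
    "the cat sleeps",
    "el gato duerme",
    lambda s: FORWARD.get(s, "*" + s),
    lambda t: BACKWARD.get(t, "*" + t),
)
for s, t in sorted(result):
    print(f"{s}\t{t}")
```

A sentence of n words has n(n+1)/2 subsegments, so the number of translator calls grows quadratically with sentence length; a real solution may want to cache translations or cap the subsegment length.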
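
Step (b) of the "adding entries" quality task, finding the most frequent unknown words, can be approached with a short script along these lines. The sketch assumes the translated text has already been produced and that unknown words are marked with a leading asterisk (for example `*foo`), as in Apertium's default output; the function name `most_frequent_unknown` and the sample text are invented for illustration.

```python
import re
from collections import Counter
from typing import List, Tuple

# Assumption: unknown words appear in the output prefixed with "*".
# \w+ stops at hyphens and apostrophes, so such words are counted only
# up to the first of those characters in this simple sketch.
UNKNOWN_WORD = re.compile(r"\*(\w+)")


def most_frequent_unknown(translated_text: str, n: int = 30) -> List[Tuple[str, int]]:
    """Return the n most frequent asterisk-marked words with their counts."""
    counts = Counter(word.lower() for word in UNKNOWN_WORD.findall(translated_text))
    return counts.most_common(n)


sample = "el *foo come *bar y *foo duerme *Foo"
for word, count in most_frequent_unknown(sample, n=2):
    print(word, count)
```

Lowercasing before counting merges sentence-initial and mid-sentence occurrences of the same word; drop the `.lower()` call if case matters for the language pair, for instance to keep proper nouns separate.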