Task ideas for Google Code-in
This is the task ideas page for Google Code-in. Here you can find ideas for interesting tasks that will improve your knowledge of Apertium and help you get into the world of open-source development.
The mentors column lists the people you should contact to request further information. Each task is estimated to take an experienced developer at most 2 hours. However:
- This estimate does not include the time needed to install and set up Apertium.
- The estimate assumes an experienced developer, so you may find that you spend more time on a task because of the learning curve.
Categories:
- code: Tasks related to writing or refactoring code
- documentation: Tasks related to creating/editing documents and helping others learn more
- research: Tasks related to community management, outreach/marketing, or studying problems and recommending solutions
- quality: Tasks related to testing and ensuring code is of high quality
- interface: Tasks related to user experience research or user interface design and interaction
You can find descriptions of some of the mentors here: List_of_Apertium_mentors.
Task ideas
| type | title | description | tags | mentors |
|---|---|---|---|---|
| code | Fix a memory leak in matxin-transfer | hargle bargle | c++ | Fran |
| research | See if you can precompile XPath expressions or XSLT stylesheets | hargle bargle | parsing | Fran |
| research | Review literature on linearisation of dependency trees | hargle bargle | parsing | Fran |
| | Tag text in Apertium format | | | Fran |
| code | Convert Chukchi Nouns to HFST/lexc | | | Fran |
| code | Convert Chukchi Numerals to HFST/lexc | | | Fran |
| code | Convert Chukchi Adjectives to HFST/lexc | | | Fran |
| | Make a (web) viewer for parallel treebanks (also for viewing diff annotation for same sentence) | | | |
| | Write a script to convert a UD treebank for a given language to a format suitable for training the perceptron tagger | | | |
| | Train the perceptron tagger for a language | | | Fran |
| | Design an annotation tool for disambiguation | | | |
| | Design an annotation tool for adding dependencies | | | |
| | Train lexical selection rules from a large parallel corpus for a language pair | | | Fran |
| | Document how to set up the experiments for weighted transfer rules | | | Fran |
| | Convert UD treebank to Apertium tags, use unigram tagger (see #apertium logs 2016-06-22) | | | |
| | Write a script to extract sentences from CoNLL-U where they have the same tokenisation as Apertium | | | Fran |
| | Convert [1] to Apertium-style documentation | | | |
| code | Implement `lt-print --strings` (`lt-print -s`) | | c++ | Fran |
| code | Implement an algorithm that prints out a transducer but only follows n cycles | | c++ | Fran |
| code | In-browser globe with Apertium languages as points | Use d3 globe to make an Apertium language/pair viewer (like pairviewer), maybe based on this or this or this. This file contains coordinates of Apertium languages. | js,html,maps | Firespeaker |
| | Make a thing to detect contexts where a path in a compiled transducer begins with a whitespace | | | |
| | Make the lt-comp compiler print a warning when a path begins with a whitespace | | | |
| | apertium-mar-hin: make the TL morph for any part of speech less daft | | morphology | vin-ivar |
| | Add other Indic scripts/formal Latin transliterations to the currently-just-WX transliterator | | python | vin-ivar |
| | apertium-hin: more consistency with apertium-mar for verbs | | morphology | vin-ivar |
| | apertium-mar: replace cases with postpositions | | morphology | vin-ivar |
| | apertium-mar: fix modals and quasi-modals | | morphology | vin-ivar |
| code | Refactor x file in apy | | | Putti |
| documentation | Add docstrings to x file in apy | | | Putti |
| code, quality | Write 10 unit tests for apy | | | Putti, (sushain, unhammer ?) |
| | Add 1 transfer rule | | | Fran, vinit |
| | Add 50 entries to a bidix | | | Fran, vinit |
| | Write 10 lexical selection rules | Write 10 lexical selection rules for a pair already set up with lexical selection | | Fran, vinit |
| | Write 10 constraint grammar rules | | | Fran, vinit |
| research, documentation | Document resources for a language | Document resources for a language without resources already documented on the wiki | | Firespeaker |
| code | apertium-hun: convert hunmorph.db into one of: (lexc Root lexicon, monodix blehs, LMF) | | | Flammie |
| code | apertium-hun: match existing apertium-hun paradigms with morphdb.hu | | | Flammie |
| code | apertium-fin-eng: go through lexicon for potential rubbish words | | | Flammie |
| code | apertium-fin-eng: add words from apertium-fin-eng to apertium-eng (minor classification required) | | | Flammie |
| code | apertium-apy: add CLARIN compatible i/o formats | | | Flammie |
| code | apertium-apy: write and deploy some CMDI stuff | | | Flammie |
| code | apertium-apy: make more parts of apertium-pipeline available through API (e.g. disambig, etc.) | | | Flammie |
| code | Deploy suggest-a-word feature in apertium.org | | | Flammie |
| code | Further developments to suggest a word | | | Flammie |
| code | Fix ordering of dependencies in CG matxin format | | | Fran |
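
Several of the tasks above work with UD treebanks in CoNLL-U format. As a possible starting point, here is a minimal Python sketch (not an existing Apertium tool) that reads CoNLL-U text and collects the surface forms of each sentence. The function name `read_conllu_sentences` and the sample data are made up for illustration. It assumes the standard tab-separated CoNLL-U layout, with the token ID in the first column and the form in the second; comparing the result against Apertium's tokenisation is left to the task itself.

```python
from typing import Iterator, List


def read_conllu_sentences(text: str) -> Iterator[List[str]]:
    """Yield each sentence as a list of surface forms (the FORM column)."""
    forms: List[str] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if forms:
                yield forms
                forms = []
            continue
        if line.startswith("#"):
            # Comment lines carry sentence-level metadata; skip them here.
            continue
        cols = line.split("\t")
        token_id = cols[0]
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1"),
        # keeping only the syntactic words.
        if "-" in token_id or "." in token_id:
            continue
        forms.append(cols[1])
    if forms:
        # The last sentence may not be followed by a blank line.
        yield forms


# Hypothetical sample data, for illustration only.
SAMPLE = (
    "# text = The dog barks\n"
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# text = del\n"
    "1-2\tdel\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "1\tde\tde\tADP\t_\t_\t2\tcase\t_\t_\n"
    "2\tel\tel\tDET\t_\t_\t0\troot\t_\t_\n"
)

if __name__ == "__main__":
    for sentence in read_conllu_sentences(SAMPLE):
        print(" ".join(sentence))
```

Whether to keep the multiword token (`del`) or its parts (`de`, `el`) depends on how the Apertium analyser for the language tokenises the text, so that choice would need adjusting per language.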