Task ideas for Google Code-in/Getting started

This page is out of date as a result of the migration to GitHub. Please update this page with new documentation and remove this warning. If you are unsure how to proceed, please contact the GitHub migration team.

This page describes some steps you can take to get involved with the Apertium project in Google Code-in. First of all, thanks for reading! We're very enthusiastic about getting new contributors to Apertium and about helping spread our passion for language technology.

First steps

So, what are the first steps?

  • Talk to us! / Get on IRC! This is the most important step! Nothing in Apertium is too hard with the right amount of help. And we like helping, so just get in contact. The best way to contact us is on IRC, and the best way to use IRC is with a client like irssi,[1] weechat,[2] hexchat[3] or LimeChat[4]. A good tip is to hang out on IRC even if no one is talking when you enter. People are in different time zones, and channel activity varies with the time of day.
Here's a list of the IRC nicks and wiki usernames of some of the mentors who are regulars on IRC:
GCI name | IRC nick | Wiki username | Email address
Jonathan W | firespeaker, jonorthwash | Firespeaker | jonathan.n.washington@gmail.com
Francis Tyers | spectie, spectei, spectre | Francis Tyers | francis.tyers@gmail.com
Maria Shejanova | maryszmary | Masha | masha.shejanova@gmail.com
Aida Sundetova | aida27 | Aida | ?
Kevin Brubeck Unhammer | Unhammer | Unhammer | unhammer+apertium@mm.st
Vinit Ravishankar | vin-ivar | Vin-ivar |
Memduh Gökırmak | fotonzade | | memduhg@gmail.com
Sushain Cherivirala | sushain, sushain97 | Sushain | sushain97@gmail.com
Xavi Ivars | xavivars | Xavi Ivars | xavi.ivars@gmail.com
Irene Tang | irene_ | Irene | irenetang14@gmail.com
Shardul Chiplunkar | shardulc[5] | Shardulc | shardul.chiplunkar@gmail.com
Anna Kondratjeva | deltamachine | deltamachine | an-an-kondratjeva@yandex.ru
Vinay Singh | SilentFlame | SilentFlame | csvinay.d@gmail.com
Jaipal Singh Goud | Schindler | Schindler | jpsinghgoud@gmail.com
Matthew Marting | m5w[5], m5w_ | M5w | matthew.marting.1@gmail.com
Tommi Pirinen | Flammie | | ffflammie@gmail.com
Inari Listenmaa | inariksit | Inariksit | ?
Marc Riera | mrieratrad | Marc Riera | marc.riera.irigoyen@gmail.com
Ng Wei En | wei2912 | Wei En | weien1292+gci@gmail.com
Marina Kustova | edgeandpearl | edgeandpearl | marinakoustova@gmail.com
Anastasia Kuznetsova | anakuz | Anakuznetsova | menina.indigena.17@gmail.com
Ngadou Yopa | math-alpha, m-alpha | Ngadou Sylvestre | yopasylvestre@gmail.com

  • Install Apertium: Not all tasks require Apertium to be installed, but if you're planning to work with Apertium, it's a good idea to do this early.
  • Find an interesting task:

Useful guidelines

Things you might want to know.


For some tasks, you may need access to Apertium resources, like the wiki or our Subversion repository. Usually this is no problem: you just need to ask a mentor or an org admin (on IRC, see above).

For an account on the wiki, we'll need an email address and your preferred username. When we create an account for you, you'll receive an email with a temporary password that you should change when you first log in.

Tasks on GitHub

For tasks relating to code on GitHub (e.g., begiak, APy, and html-tools), you just need to clone the relevant repository, make your changes, and submit a pull request.
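
The clone, change, and submit cycle can be sketched as a short Python helper that builds the usual git commands. This is only an illustration under some assumptions: the URL with YOUR-USER is a placeholder for your own fork, the branch name and commit message are made up, and the pull request itself is still opened in the GitHub web interface after the push.

```python
import shlex


def pull_request_commands(repo_url, branch, message):
    """Return the git commands for a clone / branch / commit / push cycle."""
    # git clones into a directory named after the last URL component,
    # without a trailing ".git".
    repo_dir = repo_url.rstrip("/").split("/")[-1]
    if repo_dir.endswith(".git"):
        repo_dir = repo_dir[:-4]
    return [
        ["git", "clone", repo_url],
        ["git", "-C", repo_dir, "checkout", "-b", branch],
        # Make your changes in repo_dir before running the next two commands.
        ["git", "-C", repo_dir, "commit", "-a", "-m", message],
        ["git", "-C", repo_dir, "push", "-u", "origin", branch],
    ]


if __name__ == "__main__":
    # Placeholder fork URL, branch name, and message (all hypothetical).
    commands = pull_request_commands(
        "https://github.com/YOUR-USER/html-tools.git",
        "fix-typo",
        "Fix typo in footer",
    )
    for command in commands:
        print(shlex.join(command))
```

The helper only prints the commands, so you can read them over and run them by hand; your mentor can tell you if a repository expects a different workflow.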

"Fix any bug" tasks

For tasks that point you at a repository and ask you to fix any bug, you should decide on a bug and tell your mentor which one you want to work on when you claim the task. You are also encouraged to come onto IRC (see above) and ask which bug might be a good one to work on given your background—i.e., discussing it with a mentor ahead of time.

Where is Apertium code?

Apertium code is housed in several places:

  • Most code, including the core tools, translation and language modules, and a number of other things, lives in our svn repo. The language data is found in the following places:
    • /languages - where stable monolingual language packages live
    • /incubator - where the initial stages of language data development take place, and sometimes stagnate
    • /nursery - where translation modules that have begun to become useful/usable live
    • /staging - where translation modules that are nearly ready—but are still not quite ready for production-environment use—live
    • /trunk - where translation modules that are fully developed and considered stable live; also here is the main code base, etc.
  • Many tools are also in svn, specifically /trunk/apertium-tools.
  • Several tools live on GitHub, including begiak (our IRC bot), APy (our web API), and html-tools (our website framework). The latter two of these are synchronised back into SVN (in /trunk/apertium-tools), but the main development for all three occurs on GitHub.
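
One way to picture the directory layout above is as an ordered scale of maturity. The sketch below is an illustration only: the Stage enum and the stage_of helper are hypothetical names, not part of any Apertium tool, and the example path is made up. It covers just the four directories that form a progression; /languages holds stable monolingual packages and sits outside this ordering.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Development stages, ordered from least to most mature."""
    INCUBATOR = 0  # initial development, sometimes stagnant
    NURSERY = 1    # starting to become useful/usable
    STAGING = 2    # nearly ready for production use
    TRUNK = 3      # fully developed and considered stable


def stage_of(svn_path):
    """Return the Stage for a path such as '/nursery/apertium-xxx-yyy'."""
    top_level = svn_path.strip("/").split("/")[0]
    try:
        return Stage[top_level.upper()]
    except KeyError:
        raise ValueError(f"no stage for top-level directory {top_level!r}") from None


if __name__ == "__main__":
    stage = stage_of("/nursery/apertium-xxx-yyy")
    print(stage.name, stage < Stage.STAGING)
```

Because the enum is ordered, comparisons like stage < Stage.STAGING express "not yet nearly ready" directly.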

Language and translation modules

  • Most translation modules are structured in the form of apertium-xxx-yyy, meaning it's a module that translates from language xxx to language yyy (and potentially the other way around).
    • Some older language modules use two-letter abbreviations, like apertium-xx-yy, but the standard now is three-letter.
    • Monolingual language modules are named apertium-xxx, where xxx is the ISO 639-3 code for the language.
    • All but some older translation modules rely on monolingual language modules.
  • Some monolingual language modules are based on HFST, and some are based on lttoolbox.
  • You can install pre-compiled language and translation modules for end-user use from our package repositories, but if you'd like to work on the data, you need to download the relevant one(s) and compile it/them yourself.
  • You can install pre-compiled core tools from our package repositories for end-user use or for developing language modules, but if you'd like to work on a particular tool, you need to download and compile it yourself.
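
The naming conventions above can be checked mechanically. The following is a minimal sketch, not an official Apertium utility: parse_module_name is a hypothetical helper, and it only checks the shape of a name (two or three lowercase letters per code), not whether the codes are real ISO 639-3 codes. A tool whose name happens to be three letters long would therefore be misclassified as a language module.

```python
import re

# A language code: three letters (ISO 639-3) or, in older modules, two.
_CODE = r"[a-z]{2,3}"
_MODULE_RE = re.compile(rf"apertium-({_CODE})(?:-({_CODE}))?")


def parse_module_name(name):
    """Classify a module name by the conventions described above.

    Returns ("monolingual", (xxx,)) for apertium-xxx and
    ("translation", (xxx, yyy)) for apertium-xxx-yyy.
    Raises ValueError if the name fits neither pattern.
    """
    match = _MODULE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a language or translation module name: {name!r}")
    first, second = match.groups()
    if second is None:
        return ("monolingual", (first,))
    return ("translation", (first, second))


if __name__ == "__main__":
    # Illustrative names only.
    for name in ["apertium-kaz", "apertium-kaz-tat", "apertium-es-ca"]:
        print(name, parse_module_name(name))
```

For example, parse_module_name("apertium-kaz-tat") returns ("translation", ("kaz", "tat")), while a name like "apertium-tools" raises ValueError.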


  1. https://irssi.org/
  2. https://weechat.org/
  3. https://hexchat.github.io/
  4. https://itunes.apple.com/us/app/limechat/id414030210?mt=12
  5. These IRC nicks each mirror a Matrix user. If you would like to send a PM to one of these users, you will need to register your nick with NickServ. See here for more information.