Google Code-in/Application 2014
- Organisation id*
- Organisation name*
- The Apertium project
- Organisation description*
Apertium develops a free/open-source platform for machine translation and language technology. We try to focus our efforts on lesser-resourced and marginalised languages, but also work with larger languages. The platform, including data for a large number of language pairs, a translation engine and auxiliary tools, is being developed around the world, both in universities and companies (e.g. Prompsit Language Engineering) and by a growing number of independent free-software developers. There are currently 40 published language pairs within the project (including a number of "firsts", for example Spanish-Occitan, Breton-French, Basque-Spanish, North Sámi-Norwegian Bokmål and Kazakh-Tatar, among others), and many more in development.
nlp, mt, translation, grammar, python, c++, linguistics, languages
- Organisation home page url*
- Main organisation license*
- GNU GPL 2.0/3.0
- Backup Admin*
- Why would your organisation like to participate in Google Code-in 2014?*
Apertium is really keen on participating in Google Code-in for the following four reasons:
In previous years we have really benefitted from GCI. In any free software project, there are often tasks that get pushed to the bottom of a developer's todo list, but aren't big enough for a GSoC project. We have found GCI students immensely good at helping us out with these: for instance, annotating corpora that are needed to train Apertium modules, or finding bugs in the handling of formatting, which lead to broken document translation. There have also been GCI projects which have become crucial pieces of code, for example Nathan Maxon's Kazakh analyser, which went on to be key to developing the first Kazakh-Tatar MT system, and Pim Otte's Afrikaans-Dutch system, which he presented at an international MT conference.
As Apertium is a project that focusses a lot on marginalised languages, GCI gives us a chance to get in touch with the next generation of speakers and show them how they can help their languages develop and gain some esteem. Language shift (abandoning one's language after perceiving that it is not useful for wider spheres of communication) often occurs at this age. If we can show them that their language is useful, that other people care, and that there is no barrier to its use in the 'electronic' space, then that might give it more chance of survival.
Getting kids involved early in Apertium also ensures an influx of new developers for the project but, most importantly, reinforces one of the main tenets of what is sometimes called Responsible Research and Innovation: successful development has to involve society, and Apertium development does too. Teenagers are a particularly active part of the digitally active society.
Finally, teaching the kids is just really rewarding. We help them out, answer questions and explain things, and when they get it, it's like a spark goes off; even if it has taken a long time to explain, it's a really great feeling. In fact, some of our mentors are university instructors who unfortunately find too often that university students are not too good at programming and that teaching them is hard and unrewarding; working with pre-university people may motivate them in the face of such a hard task, and also give them clues on how to teach programming at the university level.
- What years has your organisation participated in Google Summer of Code? Please indicate the years you have participated in Google Code-in or GHOP if applicable.*
2009, 2010*, 2011*, 2012*, 2013*, 2014*
- Please provide a link to your tasks page. This is one of the most important parts of your application as it lets us see what type of work you plan to have the students work on for Google Code-in and shows you already have some ideas of the types of tasks students would work on. Please be sure to include at least 4 tasks from each of the 5 categories. This is similar to the Google Summer of Code Ideas page. *
- This page needs to be updated!
- What programming languages does your organisation use?*
- What is the main development mailing list for your organisation? This question will be shown to students who would like to get more information about applying to your organisation for Google Code-in 2014. If your organisation uses more than one list, please make sure to include a description of the list so students know which to use.*
- email@example.com (general list: most traffic here)
- What is the main IRC channel for your organisation?*
- #apertium on irc.freenode.net
- Please tell us about how your organisation has prepared for Google Code-in, including what (and how many) mentors and organisation administrators have agreed to help, what your schedule and response time will be during the holidays (and otherwise during the contest period) and how you plan to deal with unresponsive mentors.*
We have four organisation administrators: Francis Tyers, Jonathan North Washington, Mikel L. Forcada and Kevin Brubeck Unhammer.
We have around 10 mentors who will be taking part. They are from a variety of time zones, from CST (UTC-6) to MSK (UTC+4).
In addition to these mentors, there will be plenty of help available to students, as there are always Apertium developers hanging out on the Apertium IRC channel. For most of our mentors, hanging out on the Apertium IRC channel, hacking, and helping other developers hack is a lot of what we do in our free time, because we do it for fun. For those of us that work, the 'holidays' are really when we are most active in Apertium. In past Google Code-ins our organisation has had no problem responding to students in time.
If for some reason a mentor becomes unresponsive (in our experience, it would have to be either task overload or 'force majeure'!), administrators will be on call to reassign the task to another mentor or evaluate it themselves.