Google Summer of Code/Application 2013

From Apertium

Application

1. Organisation name*

Apertium

2. Organisation description*

The Apertium project develops a free/open-source platform for machine translation and language technology. We focus our efforts on lesser-resourced and marginalised languages, but also work with more widely spoken languages.

The platform, including data for a large number of language pairs, a translation engine and auxiliary tools, is being developed around the world, largely in universities and companies (e.g. Prompsit Language Engineering), but independent free-software developers also play a huge role.

There are currently 33 published language pairs within the project (including a number of "firsts" — for example Aragonese—Spanish, Spanish—Occitan, Breton—French, and Basque—Spanish among others), and several more in development.

[GEMA] Apertium has a special focus on lowering the barrier to creating linguistic resources for any language, ideally to be used for MT, but also reusable for other purposes (e.g. grammar checking, morphological analysis, PoS tagging, etc.).

3. Organisation home page url*

http://www.apertium.org/

4. Main organisation license*

GNU General Public Licence

5. Veteran/New*

Veteran

6. Backup Admin*

We need to decide on one. I can be the backup admin if needed. Have we decided who'll be the main admin? --Mlforcada 15:18, 22 March 2013 (UTC)

7. If you chose "veteran" in the dropdown above, please summarise your involvement and the successes and challenges of your participation. Please also list your pass/fail rate for each year.

Apertium took part in GSoC in 2009, 2010, 2011 and 2012. We received 9 slots in 2009, 9 again in 2010, 11 in 2011, and 12 in 2012, although we gave one slot back to the pool, leaving 11. We are very happy with the results of our participation. Our main successes and challenges are described below:

Successes:

(This seems to refer only to GSoC 2012 am I right? --Mlforcada 15:18, 22 March 2013 (UTC))

  • Getting useful results: 9 out of 11 projects were successful in that they produced useful, working code, and 6 of the projects were released, which means that the code got to a sufficient level to be let into the world.
  • Getting maintainable results: 5 out of the 11 projects have had outside developers (i.e. neither the students nor their mentors) work on them.
  • Attracting and keeping new developers: Out of our 11 GSoC students last year, 8 are still working with us, and 3 have become very regular committers. Several of our GSoC students last year also helped us out with mentoring for the GCI.
  • Selecting applicants: We continued refining our selection process, and found it worked even better in 2012 than in 2011.

Challenges:

  • Getting students to work quickly: Apertium is a fairly complex pipeline that mixes programming knowledge with linguistic knowledge. Getting started is not always straightforward, and a special effort needs to be made to break the problems to be addressed by students into "chewable" pieces.
  • Covering the final furlong: Many of our GSoC projects were successful, in that the code worked, but they needed some finishing touches to be release-worthy. Encouraging students to do this proved difficult in some cases.
  • Persuading students to publicise their results: In 2009 we got around half of our students to present their work to the wider community, and in 2010 two did so (two further students who completed their projects outside of GSoC also presented their work). The others either had not planned to have the time or we weren't persuasive enough. In 2011/2012 we had one student present their work.

Pass/fail rate by year: check these

  • 2009: 8 pass, 1 fail
  • 2010: 8 pass, 1 fail
  • 2011: 9 pass, 2 fail
  • 2012: 10 pass, 1 fail

8. Why is your organisation applying to participate in Google Summer of Code 2013? What do you hope to gain by participating?*

[GEMA]

Apertium is applying again for two main reasons:

  • Apertium likes Google Summer of Code: it is a programme that supports open-source as much as we do!
  • Apertium needs Google Summer of Code: it is an incredible opportunity for us to spread the word, to attract newcomers and to improve the platform

What we hope to gain by participating is more students getting to know open-source, contributing to open-source and, especially if they are passionate about languages and computers, contributing to Apertium.

9. What is the URL for your Ideas list?*

http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code

10. What is the main development mailing list for your organisation?*

apertium-stuff@lists.sourceforge.net

11. What is the main IRC channel for your organisation?*

#apertium irc.freenode.net

12. What criteria did you use to select your mentors for this year's program? Please be as specific as possible.*

  • Active contributors: All of our mentors are active contributors to the project. Most of us know each other personally, through meetups, working together or conferences.

(Am I an active contributor? --Mlforcada 15:18, 22 March 2013 (UTC))

  • Knowledgeable in their field: Many of our mentors are university professors, PhD students or graduates. However, this alone is not enough to be considered for mentoring.
  • Enough time to spare: We ensure that our mentors have enough time to spare. Members of the project who have fewer than 5-10 hours per week to dedicate to their student are discouraged from applying to be a mentor.
  • Experience with mentoring: The majority of our mentors also have experience with mentoring from past GSoCs, having either been mentors or, in some cases, been mentored themselves. Any new mentors are paired with an experienced mentor.

13. What is your plan for dealing with disappearing students?*

Students will be encouraged to let us know how they want to divide up their time, to plan for holidays and, as far as possible, to plan for other absences. This avoids wasting the time of both mentors and students. If a mentor reports the unscheduled disappearance of a student (72 hours of silence), the administrators will contact the student. If the silence persists, the student's task will be frozen and we will report it to Google.

14. What is your plan for dealing with disappearing mentors?*

It is quite unlikely that a mentor will disappear, since all of the mentors are very active developers with a long-term commitment to the project: they are people we have met face-to-face at conferences, workshops or even in daily life.

However, there is always the possibility that some problem comes up, so we also assign backup mentors to all projects, and in many cases there are more than two mentors for a particular project.

If a mentor cannot continue for whatever reason, the backup/co-mentor will take over, and one of the organisation administrators will take on the role of second backup mentor.

15. What steps will you take to encourage students to interact with your project's community before and during the program?

First, we encourage all of our students to visit our IRC channel (#apertium @ freenode) as often as possible, even before the start of the program, since that helps them find a suitable mentor and a useful project that they can work on. We strongly advise them to read our Wiki pages and manuals, use our system, try to break it and fix it, and finally tell us about it. As a result, students get familiar with Apertium before the coding period starts, which increases their chances of ending up with a successful project.

In addition, we define coding challenges for each of the proposed projects, which serve both as an entry task and as a means of getting our students familiar with Apertium and involved in our community in the early stages of the program.

Finally, during the coding stage, we talk to our students on a daily basis and give them suggestions and advice when they get stuck.

We urge them to keep to the project plan they made when applying, and assist them when they fall behind.

16. What will you do to encourage that your accepted students stick with the project after Google Summer of Code concludes?*

We have found that the following have helped us achieve quite a high retention rate in previous years:

  • Helping students publish papers at conferences, or assisting with academic work.
  • Organising a workshop (FreeRBMT) where students can present their work to the wider community
  • Encouraging students to get involved in mentoring themselves, through the GCI programme
  • Passing on information about MSc and PhD positions, and academic and other grants.

17. Are you an established or larger organisation who would like to vouch for a new organisation applying this year? If so, please list their name(s) here.