Google Summer of Code/Application 2021

Org Profile

Website URL

http://wiki.apertium.org

Tagline

A free/open-source machine translation platform

Logo

https://upload.wikimedia.org/wikipedia/commons/thumb/b/b4/Apertium_logo.svg/1214px-Apertium_logo.svg.png

Primary Open Source License

GNU General Public License version 3

Organization Category

Technology Tags

C++ python bash XML javascript

Topic Tags

machine translation natural language processing less-resourced languages

Ideas List

http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code

Short Description

A free/open-source machine translation platform.

Long Description

Apertium is a primarily shallow-transfer machine translation system, which uses finite state transducers for all of its lexical transformations, and hidden Markov models and/or constraint grammars for part-of-speech tagging or word category disambiguation.

Most existing machine translation systems are commercial and use proprietary technologies, which makes them very hard to adapt to new uses; furthermore, they use different technologies across language pairs, which makes it very difficult, for instance, to integrate them into a single multilingual content management system. Finally, most of them are unavailable for most of the world's languages, as they rely heavily on resources that exist for only a few languages.

Apertium uses language-independent formalisms, which make it easier to contribute to the project, make development more efficient, and support the project's overall growth.

At present, Apertium has released around 50 stable language pairs, delivering fast translation with reasonably intelligible or excellent results depending on the language pair. Being an open-source project, Apertium provides tools for potential developers to build their own language pair and contribute to the project.

Application Instructions

Top tips and template: http://wiki.apertium.org/wiki/Top_tips_for_GSOC_applications

Proposal Tags

  • new language pair
  • improve language pair
  • engine improvements
  • evaluation
  • website improvements

Chat, Mailing List, or Email

Chat: http://wiki.apertium.org/wiki/IRC

Mailing list: https://lists.sourceforge.net/lists/listinfo/apertium-stuff

Application

Why does your org want to participate in Google Summer of Code?

Apertium has been part of GSoC for eleven years, and it has been a great experience. Apertium loves GSoC: it supports free/open-source (FOS) software as much as we do! Apertium needs GSoC: it offers an incredible opportunity (and resources!) to spread the word about our project, to attract new developers and consolidate the contribution of existing developers through mentoring, and to improve the platform in many ways: improving the engine, creating new tools and user interfaces, making Apertium available to other applications, improving the quality of the languages currently supported, and adding new languages. Apertium loves less-resourced languages, and GSoC gives computer-literate students who speak them an opportunity to build FOS language technologies for them. Apertium will gain more students who get to know FOS software and the ethos that comes with it and who contribute to it, and, most especially, students who are passionate about languages and computers.

What would your org consider to be a successful summer?

New contributors, newly completed features, more code written, and a better ability to guide new developers into the open-source world.

A successful summer would see any combination of newly released language pairs, the addition of new technologies to the Apertium framework, the addition of features to our web infrastructure, and a fresh round of students becoming excited by Apertium. We would especially be happy to see a successful project form the basis of a published academic paper and to gain new long-term contributors.


How many potential mentors have agreed to mentor this year?

  • Jonathan
  • Flammie
  • Fran
  • Mikel
  • Sevilay
  • Hèctor
  • Xavi
  • Jack
  • Hossep
  • Nick
  • ...

How will you keep mentors engaged with their students?

We select our mentors from among very active developers with a long-term commitment to this 17-year-old project. They are people we know well and whom we have met face-to-face at conferences, workshops, or even in daily life; some of them teach and do research at universities or work at companies using Apertium. For this reason, mentors are quite unlikely to disappear, since most of them have been embedded in our community for years. However, there is always the possibility that some problem comes up, so we also assign backup mentors to all students, in many cases more than one. If a mentor cannot continue for whatever reason, one of the backup co-mentors will take over, and one of the organisation administrators (themselves experienced GSoC mentors) will take on the role of second backup mentor.

How will you help your students stay on schedule to complete their projects?

Apertium only accepts applications with a well-defined weekly schedule, clear milestones and deliverables, and, if possible, a section on risk management (risks, their probability, their severity, and mitigating actions). Applications should also plan for holidays, exams, and other absences. Students will be encouraged to let us know if they need to reschedule or take a break. Students may also need consultation when they are stuck or when personal matters interfere with their work: we will, as we have in the past, try our best to reach out to them, be open and friendly, and provide as much support as we can to help them out. We've been students too! Detailed scheduling will keep both mentors and students from wasting time. If a mentor reports the unscheduled disappearance of a student (an unexpected 72-hour silence), the student will be contacted by the administrators. If the silence persists, their task will be frozen and we will report to Google, to proceed according to the rules of GSoC.

How will you get your students involved in your community during GSoC?

First, we encourage all prospective students to visit our IRC channel (freenode.net#apertium) as often as possible, even before the start of the program, since that will help them find a suitable mentor and a useful project that they can work on. We advise them strongly to read our wiki pages and manuals, use our system, try to break it and fix it, and finally tell us about it. As a result, students get familiar with Apertium before the coding period starts, which increases their chances of ending up with a successful project. In addition, we define coding challenges for each of the proposed projects, which serve both as an entry task, and as a means for getting our students familiar with Apertium and involved in our community in the early stages of the program. Finally, during the coding stage, we are available to talk to our students on a daily basis and give them suggestions and advice when they get stuck.

How will you keep students involved with your community after GSoC?

We have found that the following has helped us achieve quite a high retention rate in previous years:

  • Helping students publish papers at conferences, or assisting with academic work.
  • Organising workshops (such as FreeRBMT) or courses (such as http://goo.gl/jzre7e) where students can present their work to the wider community.
  • Encouraging students to get involved in mentoring themselves, through the Google Code-In programme.
  • Passing on information about MSc and PhD positions, and academic and other grants.

Has your org been accepted as a mentor org in Google Summer of Code before?

Yes

Which years did your org participate in GSoC?

2009-2014, 2016-2020

How many students did your org accept for 2020?

8

How many of your org's 2020 students have been active in your community in the past 60 days?

2?

If your org has applied for GSoC before but not been accepted, select the years:

2015

Student counts per year

e.g. 2016: 3/4
  • 2009: 8 pass out of 9
  • 2010: 8 pass out of 9
  • 2011: 9 pass out of 11
  • 2012: 10 pass out of 11
  • 2013: 10 pass out of 11
  • 2014: 15 pass out of 16
  • 2016: 11 pass out of 12
  • 2017: 10 pass out of 12
  • 2018: 11 pass out of 14 (??)
  • 2019: 10 pass out of 12
  • 2020: 7 pass out of 8

linearised: 2009: 8/9, 2010: 8/9, 2011: 9/11, 2012: 10/11, 2013: 10/11, 2014: 15/16, 2016: 11/12, 2017: 10/12, 2018: 11/14, 2019: 10/12, 2020: 7/8

Refer an organisation

Is there an organization new to GSoC that you would like to refer to the program for 2021? Feel free to add a few words about why they'd be a good fit.

Divvun! We periodically collaborate with them, and share some of the same big-picture goals. They'd be great for GSoC, both because they're trying to make the world a better place, and because of their existing connections to academia.

OmegaT?

What year was your project started?

2004 (first Google Summer of Code 2009)

Where does your source code live?

http://github.com/apertium (and some of it still in http://sf.net/projects/apertium)

Is your organization part of any government?

No.