Difference between revisions of "Google Summer of Code/Application 2009"

From Apertium
(29 intermediate revisions by 3 users not shown)
Line 1: Line 1:
<div style="align: center; border-collapse: collapse; background: #fbfbfb; border: 1px solid #aaa; border-left: 10px solid #1e90ff;">
&nbsp;&nbsp;This application is closed.
</div>
{{TOCD}}
This page lists our current application for Google Summer of Code. The ideas page can be found [[Ideas for Google Summer of Code|here]].

==Current application==

Notes for applicants here: [http://groups.google.com/group/google-summer-of-code-announce/web/notes-on-organization-selection-criteria selection criteria], and [http://code.google.com/p/google-summer-of-code/wiki/AdviceforMentors advice for mentors]

Answers to the descriptive questions should probably be 2--3 paragraphs at most, according to advice from #gsoc.

Fill out the application form [http://code.google.com/soc/2009/org_signup.html here].


===Application===
Line 14: Line 9:
;Describe your organisation.


* The Apertium project is a project which works on open-source machine translation and language technology. We try and focus our efforts on lesser-resourced and marginalised languages, but also work with larger languages.
* Two organisations team up for GSoC. One is the [http://transducens.dlsi.ua.es Transducens research group] of the [http://www.ua.es Universitat d'Alacant] (Alacant, Spain); the other one is [http://www.prompsit.com Prompsit Language Engineering]. These two organisations are currently responsible for most of the development taking place in the [http://www.apertium.org Apertium] open-source machine translation platform.
* The project, including language data, translation engine and auxiliary tools is being developed in several universities and companies around the world, with the principal part of the development on the engine being done by the [http://transducens.dlsi.ua.es Transducens research group] of the [http://www.ua.es Universitat d'Alacant] (Alacant, Spain) and [http://www.prompsit.com Prompsit Language Engineering].
* Apertium is a platform for developing rule-based machine translation systems. It was initially targeted at closely related languages (particularly the Romance languages), where it is possible to get a very high degree of accuracy in translation. Recent developments have made it possible to create systems to translate less-closely related languages. We have 17 published language pairs, and several more are currently in development.
* There are currently 17 published language pairs within the project (including a number of "firsts" &mdash; for example Spanish&mdash;Occitan and Basque&mdash;Spanish among others), and several more in development.


;Why is your organisation applying to participate in GSoC 2008? What do you hope to gain by participating?
;Why is your organisation applying to participate in GSoC 2009? What do you hope to gain by participating?


* Both organisations are very interested in seeing Apertium improve in many different directions. The Universitat, mainly because most of its research in the field of machine translation is based on Apertium components. Prompsit, because it bases its business in providing Apertium-based services.
* We are very interested in seeing Apertium improve as both a research platform, and as a platform for spreading open-source software in the translation world. As a whole, we will benefit from increased participation from outside the core group of developers: we will get new or improved tools which will help to improve translation quality for users and developers alike.
* We have found that although it is possible to attract developers interested working on language pairs, it is more difficult to find developers who are interested in work on the engine, so we would hope to find students interested in "diving a bit deeper".
* Apertium as a whole will benefit from increased participation from outside the core group of developers: we will get new or improved tools which will help to improve translation quality for users and developers alike.


;Did your organisation participate in past GSoCs? If so, please summarise your involvement and the successes and challenges of your participation.
Line 28: Line 24:
;If your organisation has not previously participated in GSoC, have you applied in the past? If so, for what year(s)?


* We applied in 2008, but unfortunately did not get through the selection procedure. We received some helpful feedback which we are taking into account when applying this year.
* We applied in 2008, but unfortunately did not get through the selection procedure. We received some helpful feedback which we have tried to follow in this last year, and we are taking it into account when applying this year.

;Who will your organisation administrator be? Please include Google Account information.

* Mikel L. Forcada (Grup Transducens, Universitat d'Alacant), <mikel.forcada at gmail.com>


;What licence(s) does your project use?
Line 40: Line 32:
;What is the URL for your ideas page?


* http://wiki.apertium.org/wiki/Projects
* http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code


;What is the main development mailing list or forum for your organisation?
Line 54: Line 46:
* We expect students to contact us using IRC or e-mail; we will make sure we get the following information from all applicants:


:* Name and e-mail address
:* Name, e-mail address, and other information that may be useful for contact

:* Current field of study / major
:* Why is it you are interested in machine translation?
:* Whether they have programmed before in an open-source project

:* Why is it that they are interested in machine translation
:* Why is it that they are interested in the Apertium project
:* Why is it that they are interested in the Apertium project?

:* Which task they are interested in, and why
:* Which of the published tasks are you interested in? What do you plan to do?
Include a one- or two-page proposal, including a title, reasons why Google and Apertium should sponsor it, a description of how and who it will benefit in society, and a detailed work plan including, if possible, a brief schedule with milestones and deliverables. Include time needed to think, to program, to document and to disseminate.

:* List your skills and give evidence of your qualifications. Tell us what is current field of study, major, etc. Convince us that you can do the work. In particular we would like to know whether you have programmed before in open-source projects.

:* Please list any non-Summer-of-Code plans you have for the Summer, especially employment and class-taking. Be specific about schedules and time commitments. we would like to be sure you have at least 10 free hours a week to develop for our project.


;Who will be your backup organisation administrator? Please include Google Account information.


* Gema Ramírez Sánchez (Prompsit Language Engineering S.L., <gramirez at gmail.com>
* Gema Ramírez Sánchez (Prompsit Language Engineering S.L.), <gramirez at gmail.com>


;Who will your mentors be? Please include Google Account information.
Line 72: Line 70:
* Felipe Sánchez Martínez <fsanchez at gmail.com>
* Sergio Ortiz Rojas <sergio.ortiz at gmail.com>
* Jacob Nordfalk <jacob.nordfalk at gmail.com>
* Kevin Donnelly <donnek at gmail.com>
* Miguel Gea Milvaques <miguelgea at gmail.com>


;What criteria did you use to select these individuals as mentors? Please be as specific as possible.


* They are all developers at the Apertium project:
* They are all long-standing experienced developers in the Apertium project, we have tried to choose a selection of developers interested in more linguistically motivated areas, and more computationally motivated areas of development.


:* Francis Tyers is one of the main coordinators of development being done outside the Universitat d'Alacant or Prompsit. He is a graduate student at the Universitat d'Alacant and also works for Prompsit Language Engineering. He has been responsible for the current visibility of Apertium in Debian and Ubuntu, has set up the Apertium wiki, takes care of the #apertium IRC channel, etc.
:* Francis Tyers is a graduate student of Computer Science at the Universitat d'Alacant and also works for Prompsit Language Engineering. He is one of the main developers on the Welsh to English translator, is working on the Breton to French translator, and is responsible for Debian packaging and general build maintenance.


:* Mikel L. Forcada is a Professor of Computer Science and has led all of the research that has been done at the Universitat d'Alacant in the field of machine translation. He is responsible for much of the current design of Apertium.
:* Mikel L. Forcada is a professor of Computer Science and has led all of the research that has been done at the Universitat d'Alacant in the field of machine translation. He is responsible for much of the current design of Apertium.


:* Jimmy O'Regan is based in Ireland, he is the instigator and developer of the English--Polish language pair, and also works on Irish. He has also been a writer for the Linux Gazette.


:* Felipe Sánchez Martínez is a graduate student at the Universitat d'Alacant under the supervision of Mikel L. Forcada. He is responsible for coding the part-of-speech tagger of Apertium as well as the maintainer of packages apertium-tagger-training-tools and apertium-transfer-tools, which allow developers of Apertium language-pair data to induce the part-of-speech tagger and an initial set of translation rules from monolingual and bilingual corpora.
:* Felipe Sánchez Martínez is an lecturer in Computer Science at the Universitat d'Alacant. He is responsible for coding the part-of-speech tagger of Apertium as well as the maintainer of packages apertium-tagger-training-tools and apertium-transfer-tools, which allow developers of Apertium language-pair data to induce the part-of-speech tagger and an initial set of translation rules from monolingual and bilingual corpora.


:* Sergio Ortiz-Rojas is the senior programmer at Prompsit Language Engineering and is responsible for most of the code in Apertium (except the one written by Felipe Sánchez Martínez); he is, therefore, the developer of reference when it comes to develop new code for the platform.
:* Sergio Ortiz-Rojas is the senior programmer at Prompsit Language Engineering and is responsible for most of the engine code in Apertium; he is, therefore, the developer of reference when it comes to develop new code for the platform.

:* Jacob Nordfalk is an associate professor of Computer Science and author of several books on programming in Java in Danish. He is the primary developer on the English--Esperanto pair and has also done a lot of work on apertium-dixtools.

:* Kevin Donnelly has a doctorate in linguistics, and has been working on the Welsh to English language pair. He has worked on a number of other pieces of free software for Welsh, including the Klebran interface for the Gramadóir grammar checker, and written several articles for Linux Magazine.

:* Miguel Gea Milvaques is a developer working who is responsible for a number of packages, he is the mentor of Francis Tyers in the Debian project and is involved in the maintenance of Apertium. He has also written the Apertium plugin for OpenOffice.


;What is your plan for dealing with disappearing students?


Students will be encouraged to let us know how they want to break up their time, and to try and plan for holidays and absences. This will avoid both mentors and students wasting time. If a mentor reports the unscheduled disappearance of a student (72-hour silence), he will be contacted by the administrators. If silence persists, his task will be frozen and we will report to Google.
* Students will be encouraged to let us know how they want to break up their time, and to try and plan for holidays and absences. This will avoid both mentors and students wasting time. If a mentor reports the unscheduled disappearance of a student (72-hour silence), they will be contacted by the administrators. If silence persists, their task will be frozen and we will report to Google.


;What is your plan for dealing with disappearing mentors?


It is quite unlikely, since all of the mentors are very active developers, with long term commitment to the project. If a mentor fails to respond adequately to a student, he or she will have been instructed to contact the administrators. The administrators will examine the situation; if disappearance (48 hour silence) is confirmed, they will assign a different mentor to them, and inform Google.
* It is quite unlikely, since all of the mentors are very active developers, with long term commitment to the project. If a mentor fails to respond adequately to a student, they will have been instructed to contact the administrators. The administrators will examine the situation; if disappearance (48 hour silence) is confirmed, they will be assigned a different mentor and Google will be informed.


;What steps will you take to encourage students to interact with your project's community before, during and after the program?


* We will make sure most developers are available as long as possible at the #apertium IRC channel, so that they get guidance with any problem they may have during development or before taking decisions on what task to select.
* Developers who have been chosen as mentors will be available for as long as possible at the #apertium IRC channel, so that the student may receive guidance with any problem they may have during development and before taking decisions on which task to select.
* We will try to get them involved as early as possible in the project, by granting them developer status, so they can modify code and data as any other developer would.
* Depending on the number of projects chosen for development, we will organise an optional workshop in Alacant so that the students may present their work to the wider group.
* Depending on the number of projects chosen for development, we will organise an optional workshop in Alacant so that the students may present their work in an informal academic setting to the wider group of developers.


;What will you do to ensure that your accepted students stick with the project after GSoC concludes?


* We will ensure that their work is well publicised and appreciated among the development community, this often gives a developer impetus to continue.
* Whenever there is a relevant research or development component in their work, we will make sure they can use it as part of their undergraduate or graduate work, and offer guidance when writing papers.
* We feel that the field of machine translation is fascinating, and as soon as they've spent a few months developing, they'll be hooked for life!


==Archived applications==

* [[/Application, 2008]]


[[Category:Google Summer of Code]]
[[Category:Google Summer of Code|Application 2009]]

Latest revision as of 19:50, 12 April 2021

  This application is closed.

This page lists our current application for Google Summer of Code. The ideas page can be found here.

Application

Describe your organisation.
  • The Apertium project works on open-source machine translation and language technology. We try to focus our efforts on lesser-resourced and marginalised languages, but also work with larger languages.
  • The project, including language data, translation engine and auxiliary tools, is being developed in several universities and companies around the world, with the principal part of the development on the engine being done by the Transducens research group of the Universitat d'Alacant (Alacant, Spain) and Prompsit Language Engineering.
  • There are currently 17 published language pairs within the project (including a number of "firsts" — for example Spanish—Occitan and Basque—Spanish among others), and several more in development.
Why is your organisation applying to participate in GSoC 2009? What do you hope to gain by participating?
  • We are very interested in seeing Apertium improve as both a research platform, and as a platform for spreading open-source software in the translation world. As a whole, we will benefit from increased participation from outside the core group of developers: we will get new or improved tools which will help to improve translation quality for users and developers alike.
  • We have found that although it is possible to attract developers interested in working on language pairs, it is more difficult to find developers who are interested in working on the engine, so we hope to find students interested in "diving a bit deeper".
Did your organisation participate in past GSoCs? If so, please summarise your involvement and the successes and challenges of your participation.
  • n/a
If your organisation has not previously participated in GSoC, have you applied in the past? If so, for what year(s)?
  • We applied in 2008, but unfortunately did not get through the selection procedure. We received some helpful feedback, which we have tried to follow over the past year and are taking into account in this year's application.
What licence(s) does your project use?
  • GNU GPL 2.0/3.0
What is the URL for your ideas page?
  • http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code
What is the main development mailing list or forum for your organisation?
  • apertium-stuff@lists.sourceforge.net
What is the main IRC channel for your organisation?
  • #apertium on irc.freenode.net
Does your organisation have an application template you would like to see students use? If so, please provide it now.
  • We expect students to contact us using IRC or e-mail; we will make sure we get the following information from all applicants:
  • Name, e-mail address, and other information that may be useful for contact
  • Why are you interested in machine translation?
  • Why are you interested in the Apertium project?
  • Which of the published tasks are you interested in? What do you plan to do?

Include a one- or two-page proposal, including a title, reasons why Google and Apertium should sponsor it, a description of whom in society it will benefit and how, and a detailed work plan including, if possible, a brief schedule with milestones and deliverables. Include the time needed to think, to program, to document and to disseminate.

  • List your skills and give evidence of your qualifications. Tell us your current field of study, major, etc. Convince us that you can do the work. In particular, we would like to know whether you have programmed before in open-source projects.
  • Please list any non-Summer-of-Code plans you have for the summer, especially employment and class-taking. Be specific about schedules and time commitments. We would like to be sure you have at least 10 free hours a week to develop for our project.
Who will be your backup organisation administrator? Please include Google Account information.
  • Gema Ramírez Sánchez (Prompsit Language Engineering S.L.), <gramirez at gmail.com>
Who will your mentors be? Please include Google Account information.
  • Francis Tyers <francis.tyers at gmail.com>
  • Mikel L. Forcada <mikel.forcada at gmail.com>
  • Jimmy O'Regan <joregan at gmail.com>
  • Felipe Sánchez Martínez <fsanchez at gmail.com>
  • Sergio Ortiz Rojas <sergio.ortiz at gmail.com>
  • Jacob Nordfalk <jacob.nordfalk at gmail.com>
  • Kevin Donnelly <donnek at gmail.com>
  • Miguel Gea Milvaques <miguelgea at gmail.com>
What criteria did you use to select these individuals as mentors? Please be as specific as possible.
  • They are all long-standing, experienced developers in the Apertium project. We have tried to choose a selection of developers interested in both the more linguistically motivated and the more computationally motivated areas of development.
  • Francis Tyers is a graduate student of Computer Science at the Universitat d'Alacant and also works for Prompsit Language Engineering. He is one of the main developers on the Welsh to English translator, is working on the Breton to French translator, and is responsible for Debian packaging and general build maintenance.
  • Mikel L. Forcada is a professor of Computer Science and has led all of the research that has been done at the Universitat d'Alacant in the field of machine translation. He is responsible for much of the current design of Apertium.
  • Jimmy O'Regan is based in Ireland. He is the instigator and developer of the English--Polish language pair, and also works on Irish. He has also been a writer for the Linux Gazette.
  • Felipe Sánchez Martínez is a lecturer in Computer Science at the Universitat d'Alacant. He is responsible for coding the part-of-speech tagger of Apertium and maintains the packages apertium-tagger-training-tools and apertium-transfer-tools, which allow developers of Apertium language-pair data to induce the part-of-speech tagger and an initial set of translation rules from monolingual and bilingual corpora.
  • Sergio Ortiz-Rojas is the senior programmer at Prompsit Language Engineering and is responsible for most of the engine code in Apertium; he is, therefore, the developer of reference when it comes to developing new code for the platform.
  • Jacob Nordfalk is an associate professor of Computer Science and the author of several Danish-language books on programming in Java. He is the primary developer on the English--Esperanto pair and has also done a lot of work on apertium-dixtools.
  • Kevin Donnelly has a doctorate in linguistics, and has been working on the Welsh to English language pair. He has worked on a number of other pieces of free software for Welsh, including the Klebran interface for the Gramadóir grammar checker, and has written several articles for Linux Magazine.
  • Miguel Gea Milvaques is a developer who is responsible for a number of packages; he is the mentor of Francis Tyers in the Debian project and is involved in the maintenance of Apertium. He has also written the Apertium plugin for OpenOffice.
What is your plan for dealing with disappearing students?
  • Students will be encouraged to let us know how they want to break up their time, and to try to plan for holidays and absences. This will avoid both mentors and students wasting time. If a mentor reports the unscheduled disappearance of a student (72-hour silence), the student will be contacted by the administrators. If the silence persists, the student's task will be frozen and we will report to Google.
What is your plan for dealing with disappearing mentors?
  • This is quite unlikely, since all of the mentors are very active developers with a long-term commitment to the project. Students will be instructed to contact the administrators if a mentor fails to respond adequately. The administrators will examine the situation; if the disappearance (48-hour silence) is confirmed, the student will be assigned a different mentor and Google will be informed.
What steps will you take to encourage students to interact with your project's community before, during and after the program?
  • Developers who have been chosen as mentors will be available for as long as possible at the #apertium IRC channel, so that the student may receive guidance with any problem they may have during development and before taking decisions on which task to select.
  • We will try to get them involved as early as possible in the project, by granting them developer status, so they can modify code and data as any other developer would.
  • Depending on the number of projects chosen for development, we will organise an optional workshop in Alacant so that the students may present their work in an informal academic setting to the wider group of developers.
What will you do to ensure that your accepted students stick with the project after GSoC concludes?
  • We will ensure that their work is well publicised and appreciated among the development community; this often gives a developer the impetus to continue.
  • Whenever there is a relevant research or development component in their work, we will make sure they can use it as part of their undergraduate or graduate work, and offer guidance when writing papers.
  • We feel that the field of machine translation is fascinating, and as soon as they've spent a few months developing, they'll be hooked for life!