User:Jalopeura/GSOC2010Application

Sean Healy
Gmail username: sean.max
Hotmail username: jalopeura

I am a Master's student in Natural Language Processing; I will defend my thesis in June 2011. I am interested in doing a GSOC project for Apertium because I would like to see how other people are doing machine translation.

I am a native speaker of English, and have the following other language skills appropriate to my project idea:

French: Minored in it, good explicit knowledge of grammar, but until recently not much practice in speaking it with native speakers. However, I have been studying in France for the last six months and steadily improving.

Portuguese (Brazilian): Lived for three years with a Brazilian roommate while taking Portuguese classes; we spoke mostly Portuguese in the apartment.

As for programming, I know both Perl and PHP. I was a professional programmer for eight years before returning to school and have experience with other technologies, but these two seem the most relevant to the project.

I have contributed to multiple Perl modules, both through mailing list discussions and through code. I have also been following the development of the Haiku operating system. I have not yet contributed any code to that project, but I have done some programming on the OS.

French-Portuguese pair for Apertium

A new language pair is always good for Apertium's visibility, and it will of course benefit users who need this particular pair. As one Apertium contributor put it, language pairs are Apertium's "bread and butter", so this project will contribute to Apertium in a meaningful way.

I have a large due June 2, and I must present it at the end of June, so I will have other obligations during Weeks 1, 2 and 6. I foresee no difficulties in finding 30 hours during Weeks 2 and 6, but during Week 1 I may be unable to spend a full 30 hours on this project. I have no other outside constraints on my time during the 12 weeks of GSOC 2010.

Tasks to complete:
Convert bilingual dictionary to Apertium format
Create French monolingual dictionary from existing pairs
    Add words from bilingual dictionary not already present
    Verify coverage
Create Portuguese monolingual dictionary from existing pairs
    Add words from bilingual dictionary not already present
    Verify coverage
Create transfer rules
Test/debug
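
As a rough illustration of the "Verify coverage" step, the sketch below computes naive coverage from text that has already been run through a morphological analyser and is in Apertium stream format, where each lexical unit looks like ^surface/analysis$ and an unknown word is echoed back with a leading asterisk. The function name, the sample sentence and its tags are assumptions made for illustration only, and escaped characters in the stream are ignored for simplicity.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (simplification: escaped ^, / and $ characters are not handled)
LEXICAL_UNIT = re.compile(r"\^([^/^$]*)/([^$]*)\$")


def naive_coverage(analysed_text):
    """Return the fraction of lexical units that received at least one analysis."""
    total = 0
    known = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        # An unknown word is returned as ^word/*word$, so its analysis
        # part starts with an asterisk.
        if not match.group(2).startswith("*"):
            known += 1
    # Avoid dividing by zero when the input contains no lexical units.
    return known / total if total else 0.0


# Hypothetical analyser output; the tags are illustrative only.
sample = "^le/le<det><def><m><sg>$ ^chat/chat<n><m><sg>$ ^zorglub/*zorglub$"
print(f"Naive coverage: {naive_coverage(sample):.2%}")
```

In this sample two of the three lexical units are known, so the coverage is two thirds. On a real corpus, the unknown surface forms could be collected in the same loop to decide which words to add to the monolingual dictionary next.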

Deliverables:
Dictionaries for this pair
Final deliverable: Functioning language pair