User:AlexMetalhead/Application GSoC 2014

!!! Work in progress !!!

Contact details

Name: Alexandru-Marian Florescu
E-mail: acdc152@gmail.com
I also try to stay on the IRC channel as much as possible, so you can find me there most of the time.

Interest in machine translation

Given my passion for computers and programming, it makes sense that I’d be interested in machine translation as well. I find it fascinating that computers, although they use a slightly different process, can still produce fairly accurate translations. I also find that working on MT is a great way of training my brain to think more freely and more generally. I have worked on MT before and found it a very rewarding experience, although it usually requires a lot of work.

Interest in Apertium

Apertium is, in my opinion, the best open-source machine translation system available at the moment. Given my interest in machine translation, it is natural that I am interested in Apertium as well. I have worked here before as a GCI participant, and I believe I learned a lot and made new friends, so I hope to repeat the experience this year as a GSoC participant.

Concerning the project, I’m very interested in the complex multiwords compiler. I believe machine translation needs to step forward, and this is one of the ways to do it. I'm confident I can improve the way we currently handle multiwords, and thus make an impact on the scope of Apertium and the quality of its translations.

Complex multiwords compiler

Reasons for Google and Apertium to sponsor it

Apertium is an MT system that prioritizes translation accuracy, but such precision can't be reached unless more work is invested in the translation process itself. One of the current issues is that the compiler does not yet handle the different types of multiwords consistently enough. Enhancing the multiwords compiler will make Apertium’s impact much bigger: it will make it much easier to adopt a new language pair, and it will greatly improve translation quality for most language pairs.

Who and how will benefit from this

First of all, essentially everyone who uses Apertium for its main purpose will benefit from this. Also, since this is an open-source project, the resulting code can become good research material for students studying formal languages or MT, given how limited the available options are at the moment.

Work plan

TODO

Skills and qualifications

I am a Computer Science student at the University of Bucharest. I have participated in Google Code-in three years in a row, winning the Grand Prize once. I have contributed code to several open-source organisations, such as GNOME, KDE and Sahana, as well as Apertium. I am also employed as a software developer at a local company. I have programmed in C, C++, C#, Java, Python, PHP and JavaScript. I am also fairly experienced with XML, having worked on XML-related tasks on different occasions, both within Apertium and outside it.

Although I am already familiar with them, I think it's worth mentioning that I am studying both C++ and Formal Languages this year, so I expect to have no real issue implementing my ideas.


Having worked here before as a student participating in GCI, I strongly believe I learned a lot, and I feel that GSoC is the next step to take. It's like a GCI task, just much bigger and more interesting.

Other plans for the summer

My time at the moment is divided between school and my part-time job (20 hrs/week). By the end of June, school will be over for the summer. Also, if I am accepted, I will pause or drop my job for the coding period, so I can be 100% dedicated to achieving my goals. This means that I will be available for work around 40 hrs/week, maybe even more if I encounter unexpected issues.