User:Surajkawade/GSOC proposal: Marathi and English


Name

Suraj Kawade

Contact information

IRC nick : develover

E-mail : suraj.kawade@gmail.com / suraj.kawade@hotmail.com

Phone no : +918983005859 / +919404943130

Skype username : yesiamsuraj

Blog : http://develover.wordpress.com/

Why are you interested in machine translation?

I am interested in linguistics and I love programming, so machine translation is a magnet for me! The world is culturally diverse, and languages are both barriers and gateways to its cultures. I have read on Wikipedia that "There are between 6000 and 7000 languages currently spoken, and that between 50-90% of those will have become extinct by the year 2100". I was shocked, but I do not want to feel helpless. Though humans speak in different tongues, they express the same things! So why shouldn't I follow my curiosity to learn how these languages are related and how they differ? And why shouldn't I help society not to forget the gift its ancestors gave it? Everything is going digital, and fast, and so is the field of NLP. MT plays a large part in that, and I want to be a part of it, however small.

Why are you interested in the Apertium project?

The best things in the world are free (as in 'freedom')! Open Source is free, and Apertium is Open Source, so by the law of transitivity Apertium is the best thing. If I say I do not want languages dying in front of my eyes, I should help to prevent it, and that is how I found Apertium. I think Apertium is a community of knowledgeable, inspiring people who are really enthusiastic about a common cause and, most importantly, love what they do and do what they love. (I figured this out while talking to them in the IRC channel.) Just as importantly, to "do" something towards preserving a language with Apertium, you need few resources at the beginning, which is really helpful, less hectic and hence encouraging. Apertium uses rule-based translation rather than plain dictionary lookup, so it works with the meanings of words and not just the words themselves, which brings it closer to how humans translate.

Why should Google and Apertium sponsor it?

On learning that nothing of release quality exists in Apertium for Marathi, I decided to work on it. Marathi is written in the Devanagari script, and Apertium has yet to release a pair containing a Devanagari-script language (most of them are in the incubator). Doing extensive work and bringing the Marathi-English pair to release quality will also encourage work on the other Devanagari-script languages in the incubator.

How will it benefit society, and whom?

Which of the published tasks are you interested in? What do you plan to do?

Work plan

Coding challenge

Community Bonding Period

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8

Week 9

Week 10

Week 11

Week 12

List your skills and give evidence of your qualifications

My non-Summer-of-Code plans for the Summer