Difference between revisions of "User:Pankajksharma/Application"

From Apertium


== Proposal ==
Which of the published tasks are you interested in? What do you plan to do?

Include a proposal, including
* a title,
* reasons why Google and Apertium should sponsor it,
* a description of how and who it will benefit in society,
* and a detailed work plan (including, if possible, a brief schedule with milestones and deliverables).

'''Title''': Command line Fuzzy-match repair from Translation Memory



Revision as of 14:18, 17 March 2014

Personal Information

Name: Pankaj Kumar Sharma

E-mail address: sharmapankaj1992@gmail.com

Other information that may be useful to contact:

My alternative email: pankaj@pankajksharma.com

Interest in ML and Apertium

Why is it you are interested in machine translation?

I am interested in Machine Translation (MT) for two reasons. The first is somewhat philosophical: the ideal of making all digital information openly available to everyone, regardless of the language in which it is written or the language used by its recipients. This would also lower the language barrier in the exchange of ideas.

The second is academic. I did my minor in Text Classification, have since become interested in Machine Learning, and this brought me closer to NLP (of which MT is a part). To be honest, I had used MT only as an end user until recently.

Why is it that you are interested in the Apertium project?

I am interested in Apertium because:

Proposal

Title: Command line Fuzzy-match repair from Translation Memory


Abstract: Given a sentence S in a source language and its translation T in another language, the idea is to find the translation T' of another sentence S'. The condition is that S and S' must have a high fuzzy-match score (that is, a low edit distance) between them. Then, depending on what changes from S to S', we apply (t, t') repair operations to T to obtain T'.
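The fuzzy-match condition on S and S' can be illustrated with a minimal sketch. It assumes a word-level Levenshtein distance, normalised by the length of the longer sentence; the proposal does not fix a formula, so both the normalisation and the function names `edit_distance` and `fuzzy_match_score` are illustrative assumptions only.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,           # delete a word from a
                            curr[j - 1] + 1,       # insert a word from b
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_match_score(s, s_prime):
    """Score in [0, 1]; 1.0 means the sentences are identical.

    Assumed definition: 1 - distance / length of the longer sentence.
    """
    a, b = s.split(), s_prime.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    print(fuzzy_match_score("the cat sat on the mat",
                            "the dog sat on the mat"))
```

A pair (S, S') would then qualify for repair only when the score is above some threshold, whose value would have to be chosen experimentally.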

Another phase of the project is to preprocess an existing translation memory for the source and target languages and store validated (s, t) pairs, where s is a sub-sequence of S, t is a sub-sequence of T, and s translates to t. These pairs could then be used to generate better, verified (s', t') pairs.
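One possible reading of this preprocessing step is sketched below. It assumes that sub-sequences are contiguous word spans, that a pair (s, t) counts as validated when the translation of s occurs verbatim inside T, and that some translator is available as a callable. The `translate` argument and the names `subsegments`, `contains` and `validated_pairs` are hypothetical and not part of any Apertium API.

```python
def subsegments(tokens, max_len):
    """Yield every contiguous span of at most max_len words."""
    for start in range(len(tokens)):
        last = min(start + max_len, len(tokens))
        for end in range(start + 1, last + 1):
            yield tuple(tokens[start:end])


def contains(tokens, segment):
    """True if segment occurs as a contiguous span of tokens."""
    n = len(segment)
    return any(tuple(tokens[i:i + n]) == segment
               for i in range(len(tokens) - n + 1))


def validated_pairs(source, target, translate, max_len=3):
    """Collect (s, t) pairs whose translation t appears inside target.

    `translate` is a hypothetical callable mapping a source string to a
    target string (an empty string when no translation is known).
    """
    src, tgt = source.split(), target.split()
    pairs = set()
    for s in subsegments(src, max_len):
        t = tuple(translate(" ".join(s)).split())
        if t and contains(tgt, t):
            pairs.add((" ".join(s), " ".join(t)))
    return pairs


if __name__ == "__main__":
    # Toy stand-in for a real translator, for illustration only.
    toy = {"the": "el", "cat": "gato", "the cat": "el gato",
           "sleeps": "duerme"}
    found = validated_pairs("the cat sleeps", "el gato duerme",
                            lambda s: toy.get(s, ""))
    print(sorted(found))
```

Storing the pairs in a set keeps each validated (s, t) only once, which matters when the same sub-segment recurs across many translation-memory entries.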

This idea was originally proposed by User:mlforcada.


Include time needed to think, to program, to document and to disseminate.

List your skills and give evidence of your qualifications. Tell us what is your current field of study, major, etc. Convince us that you can do the work. In particular we would like to know whether you have programmed before in open-source projects.

List any non-Summer-of-Code plans you have for the Summer, especially employment, if you are applying for internships, and class-taking. Be specific about schedules and time commitments. We would like to be sure you have at least 30 free hours a week to develop for our project.

No, I don't have any other engagements for the Summer and would be more than happy to devote 30+ hours every week to this project.