Difference between revisions of "User:Ramzz1"
Revision as of 05:09, 9 April 2010
Contents
- 1 Name
- 2 Affiliation
- 3 Email Address
- 4 Contact Information
- 5 Why is it you are interested in machine translation?
- 6 Why is it that you are interested in the Apertium project?
- 7 Which of the published tasks are you interested in?
- 8 Project Description
- 9 Reasons why Google and Apertium should sponsor it.
- 10 A description of how and who it will benefit in society.
- 11 Work Plan
- 12 List your skills and give evidence of your qualifications.
- 13 List any non-Summer-of-Code plans you have for the Summer.
Name
N. Kiran Kumar
Affiliation
Fourth-year BTech + MS by Research student, Department of Computer Science and Engineering, International Institute of Information Technology, Hyderabad, India.
Email Address
kirankumar.iiit@gmail.com
Contact Information
IRC: ramzz@irc.freenode.net
Phone: +91 9290447116
Why is it you are interested in machine translation?
As a Computer Science student, I have taken courses in Artificial Intelligence, Pattern Recognition and Natural Language Processing, and I have done several projects in machine learning and machine translation. Techniques such as transfer grammar rules and pattern matching, and the ways machine translation helps people in society, drew my attention to the field. Since I come from a linguistic background, I enjoy working on problems in machine learning and machine translation. The field also gives me the opportunity to meet people from different places and to learn about the culture, history and traditions of various places and groups.
Why is it that you are interested in the Apertium project?
Working on language-related projects is my strength, since my background is in Language Technologies. Apertium works on machine translation for many of the world's languages for which few resources are available, which makes it a challenging project. Open source projects of this kind are very useful to society. I was introduced to the term "open source" through Linux in my first year of undergraduate study. I liked the idea of software being open (free) to all, but I never thought of contributing to an open source project. As I started using open source projects, I began to appreciate the role a good open source project plays in the development of other projects, and I became interested in them. That is why I have chosen to work on the Apertium project.
Which of the published tasks are you interested in?
I am interested in working on the “POST-EDITION-TOOL” task.
Project Description
The main aim of this project is to build a tool that supports editing the output of the Apertium MT system. The tool must support any language pair available in Apertium, and it has to deal with various errors in the translation, such as misspelt words, unknown words, grammar mistakes, sentences with wrong word order and non-native phrases. A detailed description of the project can be found here.
Reasons why Google and Apertium should sponsor it.
This project will make it easier to correct the output of the Apertium translator, which improves the quality of the final translations. A better post-edition tool should also increase the number of people who use it.
A description of how and who it will benefit in society.
With a more accurate open source machine translation system such as Apertium, people can translate text from one language into another free of cost, which will be of great help to them. Some European companies that spend billions of rupees could also use such open source systems to reduce their costs. Consider an example: I am from India, which has many states, each with its own language. When we travel from one state to another during the summer vacation, we find it difficult to interact with the people of the other state. When we move to a different place for higher studies, we may also face difficulties because the language gap limits interaction. For ordinary people in such situations this project will be of great help.
Work Plan
Official Google Summer of Code coding starts on May 24th. If I am confident about things before that date, I will start working on the project earlier. I plan to work more or less according to the following schedule.
Timeline
Week 1: Build a basic prototype of the UI and collect suggestions from the mentor.
Week 2: Understand the various rules/features in LanguageTool and sketch a plan for incorporating useful new rules/features into the post-edition tool.
Week 3: Integrate these new rules/features with LanguageTool, which forms the baseline system of the post-edition tool.
Week 4: Implement the spell checker and build the unknown-word detector module using the Apertium monolingual dictionary and the web (if needed).
Deliverable #1: A basic post-edition tool built on LanguageTool with the extra rules/features added on top of it, integrated with the spell checker and the unknown-word detection module.
Week 5: Implement a grammar checker using POS tags and an n-gram model (bigram or trigram, depending on which gives better results). Testing is done as part of development.
Week 6: Implement the spelling suggester using the Apertium monolingual dictionary, the web and various edit distance algorithms.
Week 7: Integrate the spelling suggester and the grammar checker into the baseline system.
Week 8: Implement the other add-ons, such as identifying non-native phrases, using the Apertium monolingual dictionary, POS tags and an n-gram model; integrate them into the system and test it.
Deliverable #2: A post-edition tool with the various modules integrated into it: the spell checker and spelling suggester, the non-native phrase identifier and the grammar checker.
Week 9: Improve the front-end UI by adding a login facility, logging of all editions, etc.
Weeks 10-11: Final testing, code documentation and user how-to documentation.
Week 12: Backup week (in case any further improvements have to be made).
Project completed
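The unknown-word detection (Week 4) and spelling suggestion (Week 6) steps can be illustrated with a minimal sketch. It assumes the monolingual dictionary has already been loaded into a plain set of lowercase word forms; the function names, the distance threshold and the toy word list are hypothetical and are not part of Apertium or LanguageTool. Levenshtein distance stands in here for the "various edit distance algorithms" mentioned in the plan.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with two rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def find_unknown_words(tokens, dictionary):
    """Return the tokens whose lowercase form is not in the dictionary."""
    return [t for t in tokens if t.lower() not in dictionary]


def suggest(word, dictionary, max_distance=2, limit=5):
    """Return up to `limit` dictionary entries within `max_distance` edits,
    closest first (ties broken alphabetically)."""
    scored = ((edit_distance(word.lower(), entry), entry) for entry in dictionary)
    close = sorted((d, e) for d, e in scored if d <= max_distance)
    return [entry for _, entry in close[:limit]]


if __name__ == "__main__":
    # Hypothetical toy dictionary standing in for a monolingual dictionary.
    toy_dictionary = {"the", "house", "is", "big", "horse", "mouse"}
    for unknown in find_unknown_words("The houze is big".split(), toy_dictionary):
        print(unknown, "->", suggest(unknown, toy_dictionary))
```

Scanning the whole dictionary for every unknown word is only practical for small word lists; a real module would probably need an index or a candidate-generation step, which this sketch leaves out.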
List your skills and give evidence of your qualifications.
Java/web experience: I find Java exciting and fun to work with. I learnt Java out of my own interest and did most of my projects in it. I have been working with it for two years. I have also worked on web-related projects using JSP and MySQL as part of my academic course projects.
Database experience: I have used MySQL and built a small database as part of my coursework. Currently, I am in the fourth year of the B.Tech + M.S. by Research programme in the Department of Computer Science and Engineering at the International Institute of Information Technology, Hyderabad, Andhra Pradesh, India.
I do not have experience of working on open source projects. I participated in the Text Analysis Conference (TAC) 2009 and published a paper there on Recognizing Textual Entailment. I did that project in Java and made use of various tools, such as WordNet, MontyLingua, VerbOcean, the Stanford Parser and Stanford NER.
List any non-Summer-of-Code plans you have for the Summer.
I will finish my academic work and other commitments by May 10th. From then on, I will start learning more about the project. I have no other engagements this summer. I have three months of summer vacation and plan to spend most of that time on Google Summer of Code. I am willing to work at least 5 hours a day, 6-7 days a week, which comes to around 30-35 hours a week. On many days I may spend the whole day (around 10 hours) working only on the project.