User:Hiten
Contact Information
Name: Hiten Vidhani
Location: India
University: Birla Institute of Technology and Science Pilani
Email address: vidhani.hiten2001@gmail.com
IRC: @hi101:matrix.org
Timezone: GMT+5:30
GitHub: hitenvidhani
Why is it that you are interested in Apertium?
Which of the published tasks are you interested in? What do you plan to do?
I am interested in the task "Bring an unreleased translation pair to releasable quality". I plan to develop the Marwari-Hindi (MWR-HIN) pair.
Proposal
Deliverables:
- Creating the MWR-HIN bilingual dictionary.
- Creating the MWR monolingual dictionary.
- Updating the HIN monolingual dictionary, if required.
- Building the transfer rules for the MWR-HIN pair.
- Creating an MWR-HIN translator.
Reasons why Google and Apertium should sponsor it:
- Marwari has about 22 million speakers in India and neighboring countries. Despite this, major translation tools such as Google Translate do not support it.
- The project adds diversity to Apertium by including Marwari.
- This project will be an important addition to the community. It could be used to build further projects or to carry out research on low-resource languages, which is a growing research area.
- Releasing the first MWR-HIN translator as open source will help developers build more pairs for languages related to Marwari.
How and who it will benefit in society
- The project will benefit native speakers of Marwari and people travelling to Rajasthan, the Indian state where Marwari is the most widely used language. Rajasthan attracts tourists from all over the world, and the translator would help them communicate with local people.
- It will also help researchers in Natural Language Processing to carry out their research on Marwari.
- Developers can use this project to create other language pairs that are closely related to Marwari.
- In the long run, this project aims to reduce the language barrier that makes it difficult for people from different regions to communicate.
Work plan
Community bonding period (May 4 - May 28):
- Getting introduced to the organization and community of Apertium.
- Understanding the existing code and projects that I will need as a reference for my project.
- Discussing the project ideas and taking suggestions from the community regarding the implementation of the project.
- Exploring and finding resources for Marwari.
Work Period (May 29 - August 28):
Week 1:
Week 2:
Week 3:
Week 4:
Deliverable 1:
Week 5:
Week 6:
Week 7:
Week 8:
- Prepare for the second evaluation
Deliverable 2:
Week 9:
Week 10:
Week 11:
Week 12:
- Project completed
I am a senior Computer Science undergraduate at Birla Institute of Technology and Science Pilani (BITS Pilani), India, which is an Institute of Eminence. I did my internship at Ericsson, where I built an NLP-based ticket classifier using Python. I also developed a POS tagger for a Hindi-English (Hin-Eng) code-mixed dataset as part of the Natural Language Processing coursework at my university.
List your skills and give evidence of your qualifications. Tell us what is your current field of study, major, etc. Convince us that you can do the work.
List any non-Summer-of-Code plans you have for the Summer, especially employment, if you are applying for internships, and class-taking. Be specific about schedules and time commitments. We would like to be sure you have at least 30 free hours a week to develop for our project.