Difference between revisions of "User:Hiten"

From Apertium

Revision as of 06:09, 19 March 2023

Contact Information

Name: Hiten Vidhani

Location: India

University: Birla Institute of Technology and Science Pilani

Email address: vidhani.hiten2001@gmail.com

IRC: @hi101:matrix.org

Timezone: GMT+5:30

GitHub: hitenvidhani


Why is it that you are interested in Apertium?

Which of the published tasks are you interested in? What do you plan to do?

I am interested in the task "Bring an unreleased translation pair to releasable quality". I plan to develop the Marwari-Hindi (MWR-HIN) pair.

Proposal

Deliverables:

  • Creating the MWR-HIN bilingual dictionary.
  • Creating the MWR monolingual dictionary.
  • Updating the HIN monolingual dictionary, if required.
  • Building the transfer rules for the MWR-HIN pair.
  • Creating an MWR-HIN translator.

Reasons why Google and Apertium should sponsor it:

  • Marwari has about 22 million speakers in India and neighboring countries. Despite this, major translation tools such as Google Translate do not support it.
  • The project adds diversity to Apertium by including Marwari.
  • This project will be an important addition to the community. It could be used to build further projects or to carry out research on low-resource languages, which is a growing research area.
  • Releasing the first MWR-HIN translator as open source will help developers build more language pairs for languages related to Marwari.

How and who it will benefit in society

  • The project will benefit native speakers of Marwari and people travelling to Rajasthan, the Indian state where Marwari is the most widely used language. Rajasthan attracts tourists from all over the world, and the translator would help them communicate with local people.
  • It will also help researchers in Natural Language Processing to carry out their research on Marwari.
  • Developers can use this project to create other language pairs that are closely related to Marwari.
  • In the long run, this project aims to reduce the language barrier between people from different regions who find it difficult to communicate.

Work plan

Community bonding period (May 4 - May 28):

  • Getting introduced to the Apertium organization and community.
  • Understanding the code and projects that I will need as references for my project.
  • Discussing the project ideas and taking suggestions from the community regarding the implementation of the project.
  • Exploring and finding resources for Marwari.

Work period (May 29 - Aug 28):

Week 1:

Week 2:

Week 3:

Week 4:

Deliverable 1:

Week 5:

Week 6:

Week 7:

Week 8:

  • Prepare for the second evaluation

Deliverable 2:

Week 9:

Week 10:

Week 11:

Week 12:

  • Project completed

I am a senior Computer Science undergraduate at Birla Institute of Technology and Science Pilani (BITS Pilani), India, which is an Institute of Eminence. I did my internship at Ericsson, where I built an NLP-based ticket classifier in Python. I also developed a POS tagger for a Hin-Eng code-mixed dataset as part of the Natural Language Processing coursework at my university.

List your skills and give evidence of your qualifications. Tell us what is your current field of study, major, etc. Convince us that you can do the work.

List any non-Summer-of-Code plans you have for the summer, especially employment, internship applications, and classes. Be specific about schedules and time commitments. We would like to be sure you have at least 30 free hours a week to develop for our project.