Difference between revisions of "Category talk:GSoC 2019 student proposals"

Latest revision as of 18:37, 25 March 2019

Apertium GSOC 2019

1 Morphological Analyzer of Braj Language
2 Contact Information
3 Skills And Experience
4 University Courses:
5 Technical Skills:
6 Why Interest In Apertium ?
7 Task And Plan ?
8 Reason Why Apertium And Google Sponsor It ?
9 Description Of How And Who It Will Benefit In Society ?
10 Work Plan

Morphological Analyzer of Braj Language

Contact Information

Name – Neerav Mathur

Location – Agra, Uttar Pradesh, India - 282001

E-mail – nmathur54@gmail.com

Mobile no. - +919719009548

Github - https://github.com/ommathur54

Time Zone - UTC+5:30

Skills And Experience

My name is Neerav Mathur. I am a fourth-semester postgraduate student at K.M. Institute of Hindi and Linguistics, Dr. Bhimrao Ambedkar University, Agra, Uttar Pradesh, India. I am studying Linguistics, and during my previous semesters I learnt Python, Machine Translation, and XML. As part of my previous semester's course project, I trained and tested MaltParser for the Magahi language. To do this, my classmate Mohit and I also developed a small treebank for the language. In the current semester, we are expanding the treebank further, and we plan to implement the first full-fledged parser for the language. I am also working on the development of a machine translation system for the English-Magahi language pair. As these projects show, I am more generally interested in developing resources and technologies for under-resourced Indian languages. As part of this, I am interested in building a morphological analyzer for the Braj language (spoken in the Braj region: Agra, Mathura, Aligarh, Bharatpur, etc., in Uttar Pradesh and Rajasthan, India) using the Apertium platform. I am also working on a digital dictionary for the Sema language. This work has increased my interest in NLP, machine learning, and related areas.

University Courses:

Programming (Python, XML, HTML)

Computer Tools for Linguistics Research

Linguistics Courses (Phonetics, Morphology, Syntax, Semantics, Field work, Sign Language, Machine Translation)

Theories of Machine Translation and Machine Translation (practical)

Technical Skills:

Programming Language – Python.

Web Design – HTML.

Databases – MySQL.

Languages – Hindi (Native), Braj, English.

Why Interest In Machine Translation ?

I am studying Linguistics, and during my previous semesters I learnt Python, Machine Translation, and XML. As part of my previous semester's course project, I trained and tested MaltParser for the Magahi language. To do this, my classmate and I also developed a small treebank for the language. In the current semester, we are expanding the treebank further, and we plan to implement the first full-fledged parser for the language. I am also working on the development of a machine translation system for the English-Magahi language pair. As these projects show, I am more generally interested in developing resources and technologies for under-resourced Indian languages. During my fourth semester I attended a hands-on workshop on machine translation, where I learnt about MT systems (Apertium) for under-resourced languages and how a morphological analyzer helps to improve the performance of MT in a rule-based system. That workshop is where my interest in MT grew.

Why Interest In Apertium ?

This organization works on things that are very interesting to me as a linguist and computational linguist: (rule-based) machine translation, languages, NLP, and so on. I became more interested in Apertium when I learnt at the MT workshop that Apertium works with all kinds of languages, which is very important for supporting every language. The Apertium community is also very friendly and helpful to new members; members are always ready to help and reply quickly to whatever queries we have. This encourages me to work with Apertium.

Task And Plan ?

Task - I am interested in building a morphological analyzer for the Braj language (spoken in the Braj region: Agra, Mathura, Aligarh, Bharatpur, etc., in Uttar Pradesh and Rajasthan, India) using the Apertium platform.
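To make the task concrete, here is a minimal sketch of what a morphological analyzer does: it maps a surface form to a lemma plus grammatical tags, printed in Apertium's `^surface/lemma<tag>...$` stream format. In Apertium itself an analyzer is normally written as an lttoolbox dictionary or an HFST lexicon rather than in Python, so this only illustrates the idea. The stems `balo` and `kiru`, the suffixes, and the function names are invented placeholders, not real Braj data, and the tag names are assumptions pending the project's own tag set.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: invented placeholder forms, NOT real Braj morphology.
LEXICON: Dict[str, str] = {"balo": "n", "kiru": "vblex"}

# For each part of speech: (suffix, tags contributed by that suffix).
SUFFIXES: Dict[str, List[Tuple[str, List[str]]]] = {
    "n": [("", ["sg"]), ("e", ["pl"])],
    "vblex": [("", ["imp"]), ("t", ["pri"])],
}


def analyse(surface: str) -> List[str]:
    """Return every lemma+tags reading whose stem and suffix spell `surface`."""
    readings = []
    for stem, pos in LEXICON.items():
        if not surface.startswith(stem):
            continue
        rest = surface[len(stem):]
        for suffix, tags in SUFFIXES[pos]:
            if rest == suffix:
                tag_str = "".join(f"<{t}>" for t in [pos] + tags)
                readings.append(stem + tag_str)
    return readings


def format_apertium(surface: str) -> str:
    """Render the readings in stream format; unknown words get a '*' mark."""
    readings = analyse(surface) or ["*" + surface]
    return "^" + surface + "/" + "/".join(readings) + "$"


if __name__ == "__main__":
    for word in ["balo", "baloe", "kirut", "xyz"]:
        print(format_apertium(word))
```

A real analyzer would replace the two toy tables with the linguistic rules, tag set, and affix list produced during the project.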

Reason Why Apertium And Google Sponsor It ?

Braj is one of the most under-resourced Indian languages. It is spoken in the Braj region: Agra, Mathura, Aligarh, Bharatpur, etc., in Uttar Pradesh and Rajasthan, India. It has 1,556,314 native speakers (according to the 2011 census). No Braj translator is available online or offline, and there is no rule-based translator with a morphological analyzer. I therefore believe we can improve the quality of translation by applying a rule-based model (Apertium).

Description Of How And Who It Will Benefit In Society ?

Firstly, morphological analyzers help to improve the performance of machine translation in rule-based systems, especially for morphologically rich languages. I want to work on this project to build a morphological analyzer as a step towards developing an English-Braj MT system. Secondly, no such work has been done on MT for the Braj language, so my work will help to reduce manual effort and improve translation for the Braj language.

Work Plan

Week 1 - Preparing linguistic rules for the morphological analyzer.

Week 2 - Preparing linguistic rules for the morphological analyzer.

Week 3 - Tokenizing the data.

Week 4 - Preparing the tag set.

Deliverable #1

Submit the tokenized data and the prepared tag set.

Week 5 - Preparing the affix list and validating it (the prepared suffix list) against the corpus.

Week 6 - Writing the program to develop Braj morphological analyzer.

Week 7 - Writing the program to develop Braj morphological analyzer.

Week 8 - Train and test the model.

Deliverable #2

Submit the program and the trained and tested model.

Week 9 - Test the model on words from different domains.

Week 10 - Fixing the errors that occur in the model.

Week 11 - Train and test the model again.

Week 12 - Evaluation of the results and the model.

Project Completed

Submission of project

@@ Line 1: / Line 1: @@
-Apertium GSOC 2019
+'''Apertium GSOC 2019'''
-Morphological Analyzer of Braj
+== Morphological Analyzer of Braj Language ==
-Contact Information
-Name – Neerav Mathur
-Location – Agra, Uttar Pradesh, India -282001
-E-mail – nmathur54@gmail.com
-Mobile no. - +919719009548
-Github - https://github.com/ommathur54
-Time Zone - UTC +5.30
-Skills And Experience
+== '''Contact Information''' ==
+'''Name''' – Neerav Mathur
+'''Location''' – Agra, Uttar Pradesh, India -282001
+'''E-mail''' – nmathur54@gmail.com
+'''Mobile no.''' - +919719009548
+'''
+Github''' - [https://github.com/ommathur54]
+'''Time Zone''' - UTC +5.30
+== '''Skills And Experience''' ==
 My name is Neerav Mathur. I am 4th semester post graduate student at K.M.Institute of Hindi and Linguistics, Dr. Bhimrao Ambedkar University Agra, Uttar Pradesh, India. I am studying Linguistics and during my previous semesters I have learnt Python, Machine Translation, XML. As part of my previous semester course project, I trainer and tested MALT Parser for Magahi Language. In order to do it, me and my classmate Mohit also developed a small treebank for the language. In the current semester, we are further expanding the treebank and we plan to implement the first full-fledged parser for the language. I am also working on the development of a machine translation system for English-Magahi language pair. As you would notice, I am more generally interested in developing resources and technologies for under-resourced Indian languages. As part of this, I am interested in building a morphological analyzer for Braj Language (spoken in Braj Region of Agra, Mathura, Alighar, Bharatpur, etc in Uttar Pradesh and Rajasthan, India) using the Apertium platform. I am working on Digital Dictionary for Sema language, and I am interested in making of morphological analyzer for Braj Language. And this rise my more interest in NLP, Machine Learning etc.
-University Courses:
+== '''University Courses:''' ==
 Programing (Python, Xml, HTML)
 Computer Tools for Linguistics Research
 Linguistics Courses (Phonetics, Morphology, Syntax, Semantics, Field work, Sign Language, Machine Translation)
 Theories of Machine Translation and Machine Translation (practical)
-Technical Skills:
+== '''Technical Skills:''' ==
 Programing Language – Python.
 Web Design – HTML .
 Databases – MySQL.
-Project and Experience :
 Languages – Hindi (Native), Braj, English.
-Interest In Machine Translation ?
+ == '''Why Interest In Machine Translation ?''' ==
 I am studying Linguistics and during my previous semesters I have learnt Python, Machine Translation, XML. As part of my previous semester course project, I trainer and tested MALT Parser for Magahi Language. In order to do it, me and my classmate also developed a small treebank for the language. In the current semester, we are further expanding the treebank and we plan to implement the first full-fledged parser for the language. I am also working on the development of a machine translation system for English-Magahi language pair. As you would notice, I am more generally interested in developing resources and technologies for under-resourced Indian languages. During my fourth semester I attend Hands-on workshop on machine Translation where I get information about MT systems (Apertium) for under resourced languages and how morphological Analyzer help in increasing the performance of MT in rule based system. From there my interest get rised for MT.
-Interest In Apertium ?
+== '''Why Interest In Apertium ?''' ==
 This organization works on things which are very interesting for me as a linguist & computational linguist: (rule-based) machine translation, languages, NLP and so on. I get more interested with Apertium  when I get information in MT workshop that Apertium works with all kind of languages which is very important for support for all languages.
 Also Apertium community is very friendly and very helpful to new members, members here are always ready to help us ( Apertium community reply frequently on what ever query do we have ). It encourages me to work with  Apertium.
+== '''Task And Plan ?''' ==
-Task And Plan ?
 Task - I am interested in building a morphological analyzer for Braj Language (spoken in Braj Region of Agra, Mathura, Alighar, Bharatpur, etc in Uttar Pradesh and Rajasthan, India) using the Apertium platform.
-Reason Why Apertium And Google Sponsor Me ?
+== '''Reason Why Apertium And Google Sponsor It ?''' ==
 Braj is one of the most under-resourced Indian language. Which is spoken in Braj Region of Agra, Mathura, Alighar, Bharatpur, etc in Uttar Pradesh and Rajasthan, India. It is spoken by 1,556,314  native speaker (according 2011 census) . There is no Braj translator present online or offline and there is no rule-based translator with morphological analyzer. So I believe we can improve the quality of translation by applying rule-based model (Apertium).
-Description Of How And  Who It Will Benefit In Society ?
+== '''Description Of How And  Who It Will Benefit In Society ?''' ==
 Firstly, Morphological Analyzers will help in increasing the performance of Machine translation in rule based system, especially for morphologically rich languages. I want to work on this project to make morphological analyzer for developing English - Braj MT system.
 Secondly, no such work has been done on MT for Braj Language, so my work will contribute to reduce the human work and improve the translation for Braj Language.
-Work Plan
+== '''Work Plan''' ==
-Week 1 -  Preparing linguistic rule for Morphological analyzer
+'''Week 1''' -  Preparing linguistic rule for Morphological analyzer.
-Week 2 -  Preparing linguistic rule for Morphological analyzer
+'''Week 2''' -  Preparing linguistic rule for Morphological analyzer.
-Week 3 -   Tokenizing the data
-Week 4 -  Prepare tag set.
+'''Week 3''' -   Tokenizing the data.
-Deliverable #1
+'''Week 4''' -  Prepare tag set.
+'''Deliverable #1'''
 Submit the Tokenized and prepared tagset
-Week 5 -  Preparing the affix list validate (Prepared Suffix list ) in corpus
+'''Week 5''' -  Preparing the affix list validate (Prepared Suffix list ) in corpus
-Week 6 – Writing the program to develop Magahi morphological analyzer.
-Week 7 - Writing the program to develop Magahi morphological analyzer
+'''Week 6''' – Writing the program to develop Braj morphological analyzer.
+'''Week 7''' - Writing the program to develop Braj morphological analyzer.
+'''Week 8''' -  Train and test the model.
+'''Deliverable #2'''
-Week 8 -  Train and test the model
-Deliverable #2
 Submit the program and trained, test model
-Week 9 - Test the model with different domain of word.
-Week 10 -  Fixing the occurring error in model.
-Week 11 -Again train and test the model
-Week 12 - Evaluation of results or model
+'''Week 9''' - Test the model with different domain of word.
-Project Completed
+'''Week 10''' -  Fixing the occurring error in model.
+'''Week 11''' -Again train and test the model.
+'''Week 12''' - Evaluation of results or model.
+'''Project Completed'''
 Submission of project
