Difference between revisions of "User:Kiara/GSoC'16 Proposal"

Revision as of 22:03, 14 March 2016

Name: Kira Droganova

E-mail address: kira.droganova@gmail.com

Other information that may be useful to contact you: #apertium IRC channel: Kira (Kiara)

Why is it you are interested in machine translation?

I am pursuing a Master's degree in Computational Linguistics at the Higher School of Economics (Moscow). I think machine translation is one of the most complex areas of computational linguistics and, at the same time, one of the most practical. I like this combination. People need MT tools in many areas of life, which means the tools have to be of high quality.

Why is it that you are interested in the Apertium project?

I like the idea of Apertium. It is great that anyone has a chance to take part in this project. At first, it seems impossible to start working in machine translation without any experience in the area. However, Apertium is well documented and the team always helps newcomers. Both things are very important to graduates and to people who have just started working in machine translation. One of Apertium's greatest features is how easy it is to add a new language pair. In my opinion, this is an extremely important feature of the project, and I also like the idea of general rules for closely related languages.

Which of the published tasks are you interested in? What do you plan to do?

I'm interested in the Apertium website improvements tasks. I think I can do all the tasks listed under "Apertium website improvements" on the Ideas for GSoC page. However, this partly depends on the readiness of the back-end functionality. I think I can work on both the back end and the front end. Please see the schedule details in my proposal.


Apertium website improvements

The new features benefit both Apertium users and the Apertium team. Website users will get an improved tool with a new dictionary lookup mode, which is the second most important feature after translation itself. The feedback feature is important to the Apertium team: the team will be able to learn more about Apertium from its users, and the tool will get more testing from people who don't have a technical background. Both the feedback page and the reliability visualisation make the site more user-friendly, so it can grow into one of the coolest online translation tools. I am cool and highly motivated. I can develop many useful features for Apertium. If you help me get started in MT, I will not miss my chance.

I propose this schedule:

Preparation (22nd of April - 22nd of May):

i. To ask mentors about 'must-know' information
ii. To learn how to use the Tornado framework
iii. To inspect the HTML, CSS, Bootstrap and JS
iv. To inspect the Python scripts
v. To try the language identification feature

Coding (25th of May - 23rd of August):

Week 1: Feedback feature (discussion and development)
Week 2: "Dictionary lookup" mode (discussion and back-end development, ranking algorithm development)
Week 3: "Dictionary lookup" mode (discussion and front-end development, bug fixing and testing)
Week 4: Language detection feature (discussion and development)
Deliverable #1: Feedback feature and Dictionary lookup feature
Week 5: Language detection: "did you mean" function
Week 6: Reliability visualisation: a translation color depends on how reliable it is (discussion, algorithm and development)
Week 7: Reliability visualisation (bug fixing, testing and documenting)
Week 8: RBMT summer school
Deliverable #2: Language detection feature and Reliability visualisation feature
Week 9: RBMT summer school
Week 10: Webpage translation (some buttons/labels are written only in English: Translate a document, Instant translation)
Week 11: Bug fix and documentation
Week 12: Bug fix and documentation
Project completed
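
To illustrate the reliability visualisation idea from Week 6, here is a minimal sketch in Python 3. It is not the planned implementation: the score range (0.0 to 1.0), the red-to-green gradient, and the names `reliability_to_color` and `colorize` are assumptions made for illustration only. How reliability is actually measured is still to be discussed with the mentors.

```python
from html import escape


def reliability_to_color(score):
    """Map a reliability score in [0.0, 1.0] to a CSS hex color.

    Hypothetical helper: 0.0 gives pure red, 1.0 gives pure green.
    """
    # Clamp out-of-range scores so the color is always valid.
    score = max(0.0, min(1.0, score))
    red = round(255 * (1.0 - score))
    green = round(255 * score)
    return "#{:02x}{:02x}00".format(red, green)


def colorize(words_with_scores):
    """Wrap each (word, score) pair in a <span> colored by its score."""
    spans = [
        '<span style="color: {}">{}</span>'.format(
            reliability_to_color(score), escape(word)
        )
        for word, score in words_with_scores
    ]
    return " ".join(spans)


if __name__ == "__main__":
    # Assumed input: translated words paired with made-up reliability scores.
    print(colorize([("cat", 1.0), ("sat", 0.0)]))
```

On the website, the same mapping could equally be done in JavaScript on the front end; the back end would then only need to return a score per word.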

List of technologies: Python 3, HTML, CSS, jQuery, Bootstrap

List of projects:

1. A service that suggests Zaliznyak's grammatical indexes for "new Russian words": http://web-corpora.net/wsgi3/GDictionary/ I developed the back end, the front end and some of the Flask functions.
2. I trained a dependency parsing model for Russian with MaltParser and the MyStem tagset. My paper was published in the Proceedings of the AINL-ISMW FRUCT: Kira Droganova, Building a Dependency Parsing Model for Russian with MaltParser and MyStem Tagset. In Proceedings of the AINL-ISMW FRUCT, Saint-Petersburg, Russia, 9-14 November 2015, ITMO University, FRUCT, Finland. ISBN 978-5-7577-0493-7
3. Syntactic parser for Russian: http://web-corpora.net/wsgi3/ru-syntax/ I trained a new syntactic model and improved its quality, prepared and tested segmentation rules, and worked with quality metrics.
4. I am a member of the Russian UD team. I am currently working on conversion rules for morphological tag sets.
5. I also did the Apertium coding challenges. I sent a pull request and a diff to the Apertium website improvements mentors. This is the link to my answer: https://github.com/Kira-D/apertium-html-tools/tree/GSoCChallenges