Difference between revisions of "User:Kiara/GSoC'16 Proposal"

 
Latest revision as of 01:43, 8 March 2018

Name: Kira Droganova

E-mail address: kira.droganova@gmail.com

Other information that may be useful to contact you: #apertium IRC channel: Kira (Kiara)


Why is it you are interested in machine translation?

I am pursuing a Master's degree in Computational Linguistics at the Higher School of Economics (Moscow). I think machine translation is one of the most complex areas of computational linguistics and, at the same time, one of the most practical. That combination is what attracts me. People need MT tools in many areas of life, which means the tools have to be of high quality.


Why is it that you are interested in the Apertium project?

I like the idea of Apertium. It is great that anyone has a chance to take part in this project. At first it seems impossible to start working in machine translation without any experience in the area. However, Apertium is well documented and the team always helps new people. Both things are very important to graduates and to people who have just started working in machine translation. One of the greatest features is how easy it is to add a new language pair. In my opinion this is an extremely important feature of the project, and I also like the idea of general rules for closely related languages.


Which of the published tasks are you interested in? What do you plan to do?

I'm interested in the Apertium website improvements tasks. I think I can do all the tasks listed under "Apertium website improvements" on the Ideas for GSoC page. However, this partly depends on the readiness of the back-end functionality. I think I can do both the front-end and the back-end work. Please see the schedule details in my proposal.


Apertium website improvements

The new features benefit both Apertium users and the Apertium team.

Apertium website users will get an improved tool with a new dictionary lookup mode, which is the second most important feature after translation itself.

The feedback feature is important to the Apertium team. The team will learn more about Apertium from its users, and the tool will get more testing from people who don't have a technical background.

Both the feedback page and the reliability visualisation make the site more user-friendly, so it can grow into one of the coolest online translation tools.

I am cool and highly motivated. I can develop many useful features in Apertium. If you help me get started in MT, I will not miss my chance.


I propose this schedule:

Preparation (22nd of April - 22nd of May):

i. To ask mentors about 'must-know' information

ii. To learn how to use the Tornado framework

iii. To inspect the HTML, CSS, Bootstrap and JS

iv. To inspect the Python scripts

v. To try the language identification feature


Coding (25th* of May - 23rd of August):

Week 1: Language detection feature (Discussion and development)

Week 2: Language detection: "did you mean" function

Week 3: "Dictionary lookup" mode (Discussion and back-end development, ranking algorithm development)

Week 4: "Dictionary lookup" mode (Discussion and front-end development, bug fixing and testing)

Deliverable #1: Language detection feature and Dictionary lookup feature

Week 5: Feedback feature (Discussion and development)

Week 6: Reliability visualisation: the color of a translation depends on how reliable it is (Discussion, algorithm and development)

Week 7: Reliability visualisation (bug fixing, testing and documenting)

Week 8: RBMT summer school

Deliverable #2: Feedback feature and Reliability visualisation feature

Week 9: RBMT summer school

Week 10: Webpage translation (Some buttons/labels are written only in English: Translate a document, Instant translation)

Week 11: Bug fixing and documentation

Week 12: Bug fixing and documentation

Project completed

* I have to finish my thesis by the end of this week (23rd - 27th of May). I'll do my best to finish it as soon as possible.
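
As an illustration of the reliability visualisation planned for Weeks 6 and 7, here is a minimal sketch of how a translation's color could depend on its reliability. Everything in it is an assumption made for illustration: the proposal does not define how reliability is measured, so the score is taken to be a number between 0 and 1, and the names reliability_to_color and colorize are hypothetical, not part of apertium-html-tools.

```python
import html


def reliability_to_color(score: float) -> str:
    """Map a reliability score in [0, 1] to a hex color from red (0) to green (1).

    Assumption: the score scale is hypothetical; the proposal does not define it.
    """
    # Clamp so that out-of-range scores still produce a valid color.
    score = max(0.0, min(1.0, score))
    red = round(255 * (1.0 - score))
    green = round(255 * score)
    return f"#{red:02x}{green:02x}00"


def colorize(words_with_scores):
    """Wrap each (word, score) pair in a span styled with its reliability color."""
    return " ".join(
        f'<span style="color: {reliability_to_color(score)}">{html.escape(word)}</span>'
        for word, score in words_with_scores
    )


if __name__ == "__main__":
    # Hypothetical scores, only to show the shape of the output.
    print(colorize([("hello", 1.0), ("world", 0.0)]))
```

A linear red-to-green scale is only one possible choice. The actual algorithm and color scheme would be settled during the Week 6 discussion with the mentors.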


List of technologies: Python 3, HTML, CSS, jQuery, Bootstrap

List of projects:

1. A service that suggests Zaliznyak's grammatical indexes for "new Russian words".

http://web-corpora.net/wsgi3/GDictionary/

I developed the back end, the front end and some of the Flask functions.

2. I trained a dependency parsing model for Russian with MaltParser and the MyStem tagset.

My paper was published in Proceedings of the AINL-ISMW FRUCT:

Kira Droganova. Building a Dependency Parsing Model for Russian with MaltParser and MyStem Tagset. In Proceedings of the AINL-ISMW FRUCT, Saint-Petersburg, Russia, 9-14 November 2015, ITMO University, FRUCT, Finland. ISBN 978-5-7577-0493-7

3. Syntactic parser for Russian

http://web-corpora.net/wsgi3/ru-syntax/

I trained a new syntactic model and improved its quality, prepared and tested segmentation rules, and worked with quality metrics.

4. I am a member of the Russian UD team. I am currently working on conversion rules for morphological tag sets.

5. I also did the Apertium coding challenges. I sent a pull request and a diff to the Apertium website improvements mentors.

This is the link: https://github.com/apertium/apertium-html-tools/pull/53