Difference between revisions of "User:Quirille/GSOC proposal 2013"

From Apertium
Latest revision as of 19:03, 21 March 2014

Contact information

Name: Krylov Kirill

Email: knp...@gmail.com

IRC: quirille

Other contact information can be provided to the mentor.

Why is it you are interested in machine translation?

I am very interested in both linguistics and computer science, which are the main constituents of machine translation. At school I took in-depth courses in English and Russian for 10 years. They were among my favorite subjects, and I explored many linguistic issues (not only ones concerning Russian and English). Although at university I mostly study programming and computer science, I have kept up my passion for linguistics. I find the fields of natural language processing and machine translation very attractive and promising, and I want to specialize in them.

Why is it that you are interested in the Apertium project?

The Apertium project would give me the opportunity to work in the field of machine translation. In addition, Apertium is open source, which is a very interesting approach to software development. Apertium also has many tasks that would be exciting to implement.

Which of the published tasks are you interested in? What do you plan to do?

Title

Ukrainian-Russian language pair for unidirectional translation from Ukrainian to Russian

Reasons why Google and Apertium should sponsor it

Currently Apertium has no release-quality language pair with Russian, and there is an incomplete Ukrainian-Russian language pair in the incubator. It should be brought to release quality. Also, there are uncoordinated morphological and morphophonological files for Russian in different directories; they should be consolidated.

A description of how and who it will benefit in society

Completing this task will provide a free and open-source translation system from Ukrainian to Russian. It will help support language diversity in Russia and Ukraine. Ukrainian and Russian are the two most widely spoken languages in Ukraine, so automating translation will save a lot of time. Having this translation pair may also extend contacts between Russian-speaking and Ukrainian-speaking people.

Work plan

Community bonding period (May 27 - June 16):

  • Getting familiar with the Apertium tools and community
  • Finding the language resources for Ukrainian and Russian
  • Studying testvocing
  • Studying the existing ru-uk monodices, bidix and transfer rules

Work Period (June 17 - September 15)

Week 1:

  • Start extending the Ukrainian monodix to the size of the Russian one, adding new entries to the bidix and adding the necessary uk-ru transfer rules
  • Check and add conjunctions and prepositions to uk monodix

Week 2:

  • Check and add adverbs to uk monodix

Week 3:

  • Check and add numerals to uk monodix

Week 4:

  • Check and add pronouns and determiners to uk monodix

Deliverable #1: updated uk monodix, ru-uk bidix and ru-uk transfer rules

Week 5:

  • Check and add nouns to uk monodix

Week 6:

  • Check and add nouns to uk monodix

Week 7 (Midterm July 29 - August 2):

  • Check and add adjectives to uk monodix

Deliverable #2: updated uk monodix, ru-uk bidix and ru-uk transfer rules

Week 8:

  • Check and add adjectives to uk monodix

Week 9:

  • Check and add verbs to uk monodix

Week 10:

  • Check and add verbs to uk monodix

Deliverable #3: finished uk monodix, ru-uk bidix and ru-uk transfer rules

Week 11:

  • Testing

Week 12:

  • Testing

Week 13:

  • Testing

Project completion (September 16 - September 23):

  • Tidying up, releasing

Final evaluation (September 23 - September 27)

List your skills and give evidence of your qualifications

I am in the 4th (penultimate) year of the spetsialist (специалист, a Russian degree between Bachelor's and Master's) degree program in Computer Science and Engineering at the Institute of Management and Information Technologies of the Saint Petersburg State Polytechnical University (Russia).

I am a native speaker of Russian. As Ukrainian is close to Russian, I can understand it. I am also able to work out the morphological and syntactic peculiarities of Ukrainian.

Programming skills: C, C++, C# and .NET, Matlab, Python, git. I am ready to learn Perl (if necessary).

At the institute I took courses in Machine Learning and Automata Theory. I think this knowledge will help me understand Apertium more deeply, especially finite-state transducers. I have also done some NLP-related work during my studies. As a course paper for the Machine Learning course, I wrote a text attribution program in Matlab based on the bag-of-words approach and machine learning algorithms (using the randomforest-matlab library by Abhishek Jaiantilal and libsvm). As a course paper for the Machine Vision course, I wrote a C# program for image classification based on the bag-of-words model and the SVM algorithm (using EmguCV, a C# wrapper for OpenCV).
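
For illustration, the bag-of-words representation mentioned above can be sketched in a few lines of Python. This is only a minimal sketch of the general idea, not the Matlab program itself: the function names (tokenize, build_vocabulary, bag_of_words) are hypothetical, and the classifier step (random forest or SVM) is left out.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (Latin or Cyrillic)."""
    return re.findall(r"\w+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token in the corpus to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector for `text`; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    # Dict iteration follows insertion order, which here is the sorted token order.
    return [counts.get(tok, 0) for tok in vocabulary]
```

Each text becomes a fixed-length vector of word counts over a shared vocabulary, and such vectors are what a classifier would then take as input.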

During the last year I worked at Mallenom Systems, a company attached to our institute, as a tester on two projects: the traffic simulation system Road Manager and the software suite Automated Rolling Stock Car Identification System (ARSCIS). This job gave me teamwork skills and a working knowledge of git, and it helped me look at the programmers’ job “from the other side of the barricade”.

List any non-Summer-of-Code plans you have for the Summer

I have no non-GSoC plans for the summer and can contribute 30 to 40 hours a week. However, I have exams at the institute from the 3rd of June to the 21st of June, and the next term starts on the 2nd of September, so I will start the community bonding period earlier.