Difference between revisions of "User:Quirille/GSOC proposal 2013"
Revision as of 10:19, 3 May 2013
Contents
- 1 Contact information
- 2 Why is it you are interested in machine translation?
- 3 Why is it that you are interested in the Apertium project?
- 4 Which of the published tasks are you interested in? What do you plan to do?
- 5 List your skills and give evidence of your qualifications
- 6 List any non-Summer-of-Code plans you have for the Summer
Contact information
Name: Krylov Kirill
E-mail address: knpnvv[at]gmail.com
IRC: quirille
Other contact information can be provided to the mentor.
Why is it you are interested in machine translation?
I am very interested in both linguistics and computer science, which are the main constituents of machine translation. At school I took in-depth courses in English and Russian for 10 years. They were among my favorite subjects, and I explored many linguistic questions (not only concerning Russian and English). Although at university I mostly study programming and computer science, I have kept up my passion for linguistics. I find the fields of natural language processing and machine translation very attractive and promising, and I want to specialize in them.
Why is it that you are interested in the Apertium project?
The Apertium project could give me the opportunity to work in the field of machine translation. In addition, Apertium is open source, which is a very interesting approach to software development. Apertium also has many tasks that would be exciting to implement.
Which of the published tasks are you interested in? What do you plan to do?
Title
Ukrainian-Russian language pair for unidirectional translation from Ukrainian to Russian
Reasons why Google and Apertium should sponsor it
Currently Apertium has no release-quality language pair with Russian, and there is an incomplete Ukrainian-Russian language pair in the incubator. It should be brought to release quality. Also, there are uncoordinated morphological and morphophonological files for Russian in different directories; they should be consolidated.
A description of how and who it will benefit in society
Completing this task will provide a free and open-source translation system from Ukrainian to Russian. It will help to support language diversity in Russia and Ukraine. Ukrainian and Russian are the two most spoken languages in Ukraine, so automating translation will save a lot of time. This translation pair may also broaden contacts between Russian-speaking and Ukrainian-speaking people.
Work plan
Community bonding period (May 27 - June 16):
- Getting familiar with the Apertium tools and community
- Finding the language resources for Ukrainian and Russian
- Studying testvocing
- Studying the existing ru-uk monodices, bidix and transfer rules
- Studying the existing Russian monodices
Work Period (June 17 - September 15)
Week 1:
- Start working on ru&uk monodices
Week 2:
- Continue working on ru&uk monodices
Week 3:
- Continue working on ru&uk monodices
Week 4:
- Reviewing ru&uk monodices
Deliverable #1: updated ru&uk monodices, coordinated ru monodices
Week 5:
- Start working on ru-uk bidix
Week 6:
- Continue working on ru-uk bidix
Week 7 (Midterm July 29 - August 2):
- Reviewing ru-uk bidix
Deliverable #2: updated bidix
Week 8:
- Start working on ru-uk transfer rules
Week 9:
- Continue working on ru-uk transfer rules
Week 10:
- Continue working on ru-uk transfer rules
Week 11:
- Reviewing ru-uk transfer rules
Deliverable #3: updated ru-uk transfer rules
Week 12:
- testvocing
Week 13:
- testvocing
Project completion (September 16 - September 23)
Final evaluation (September 23 - September 27)
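As a quick check of the schedule above, the week boundaries can be computed from the stated start of the work period. This is only a minimal sketch in Python, assuming that each of the 13 work weeks runs from Monday to Sunday starting on June 17, 2013; the names WORK_START, NUM_WEEKS and week_range are hypothetical helpers for illustration, not part of any Apertium tool.

```python
from datetime import date, timedelta

# Assumption: the work period starts on Monday, June 17, 2013, and
# consists of 13 full Monday-to-Sunday weeks, as in the work plan.
WORK_START = date(2013, 6, 17)
NUM_WEEKS = 13


def week_range(week):
    """Return (first_day, last_day) of the given 1-based work week."""
    if not 1 <= week <= NUM_WEEKS:
        raise ValueError(f"week must be between 1 and {NUM_WEEKS}")
    first = WORK_START + timedelta(weeks=week - 1)
    return first, first + timedelta(days=6)


if __name__ == "__main__":
    # Print the date range of every work week.
    for week in range(1, NUM_WEEKS + 1):
        first, last = week_range(week)
        print(f"Week {week:2d}: {first:%B %d} - {last:%B %d}")
```

Under these assumptions week 7 begins on July 29, which matches the start of the midterm, and week 13 ends on September 15, the last day of the work period.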
List your skills and give evidence of your qualifications
I am in the 4th (penultimate) year of the spetsialist degree (специалист, a Russian degree between Bachelor's and Master's) in Computer Science and Engineering at the Institute of Management and Information Technologies of the Saint Petersburg State Polytechnical University (Russia).
I am a native speaker of Russian. As Ukrainian is close to Russian, I can understand it. I am also able to work out the morphological and syntactic peculiarities of Ukrainian.
Programming skills: C, C++, C# and .NET, Matlab, Python, git. I am ready to learn Perl (if necessary).
At the institute I took courses in Machine Learning and Automata Theory. I think this knowledge will help me to understand Apertium more deeply, especially finite-state transducers. I have also done some NLP-related work during my studies. As a course paper for the Machine Learning course I wrote a text attribution program in Matlab based on the Bag of Words approach and machine learning algorithms (using the libraries randomforest-matlab by Abhishek Jaiantilal and libsvm). As a course paper for the Machine Vision course I wrote a C# program for image classification based on the Bag of Words model and the SVM algorithm (using EmguCV, a C# wrapper for OpenCV).
During the last year I worked as a tester at Mallenom Systems, a company attached to our institute, on 2 projects: the traffic simulation system Road Manager and the software suite Automated rolling stock car identification system (ARSCIS). This job gave me teamwork skills and knowledge of such a great tool as git, and helped me to look at the programmers’ job “from the other side of the barricade”.
List any non-Summer-of-Code plans you have for the Summer
I have no non-GSoC plans for the summer and can contribute 30 to 40 hours a week. However, I have exams at the institute from the 3rd of June to the 21st of June, and the next term starts on the 2nd of September. So I will start the community bonding period earlier.