User:Kvld/Proposal

From Apertium
Revision as of 12:29, 2 March 2016 by Kvld

Contact information

  • Name: Vladislav Kiryukhin
  • E-mail: kiryukhinv(at)gmail.com
  • IRC: kvld

Why is it you are interested in machine translation?

Machine translation combines linguistics and programming, both of which interest me. It is a very important field that lets people access information from different parts of the world across language barriers. MT is especially valuable for lesser-known languages, for which human translators are often not available.

Why is it that you are interested in the Apertium project?

I completed a few small tasks for Apertium during Google Code-In 2012. At that time I was a high school student; now that I am over 18 and studying at university, I can dedicate time to a more significant contribution.

Which of the published tasks are you interested in? What do you plan to do?

Title

New Belarusian <-> Russian language pair

Reasons why Google and Apertium should sponsor it

Currently Apertium has no language pairs that include Belarusian. My plan is to complete the bel-ru pair and bring it to release quality.

A description of how and who it will benefit in society

Completing this project will provide a free and open-source translation system from Belarusian to Russian. Both languages are official in Belarus, so automating translation can help people in many situations and save a lot of time.
Belarusian is also marked as "vulnerable" on the UNESCO list of endangered languages, and any project in Belarusian may help to popularize it.

Work plan

Community bonding period:

  • Getting more familiar with Apertium,
  • Reading the available documentation and studying the existing pairs,
  • Finding language resources and dictionaries that can be used,
  • Checking the existing Belarusian files in the Incubator.

Work period:

Week               Target
Week 1             Write parsers for the source dictionaries and convert the parsed data to Apertium dictionary formats.
Week 2             Add and check nouns, pronouns and numerals.
Week 3             Add and check nouns, pronouns and numerals.
Week 4             Add and check conjunctions. Add the necessary bel-ru transfer rules.
Deliverable #1     Updated bel monodix, bel-ru bidix and some bel-ru transfer rules.
Week 5             Add and check adverbs.
Week 6             Add and check verbs.
Week 7 (midterm)   Add and check verbs. Start adding adjectives.
Week 8             Add and check adjectives. Add bel-ru transfer rules.
Deliverable #2     Almost finished bel monodix, bel-ru bidix and bel-ru transfer rules.
Week 9             Extend word coverage. Adjust transfer rules as necessary. Run testvoc.
Week 10            Extend word coverage. Adjust transfer rules as necessary. Run testvoc.
Week 11            Extend word coverage. Adjust transfer rules as necessary. Run testvoc.
Week 12            Write documentation and clean up code.
Project completed  Finished language pair.
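
The Week 1 conversion step could look roughly like the sketch below. It is only an illustration under stated assumptions, not the converter that will actually be written: the tab-separated input layout (Belarusian lemma, Russian lemma, part-of-speech tag) and the function names bidix_entry and convert are hypothetical, and the real source dictionaries will most likely need their own parsers. Only the shape of the output, an <e><p><l>…</l><r>…</r></p></e> entry with <s n="…"/> symbols, follows the usual Apertium bilingual dictionary layout.

```python
from xml.sax.saxutils import escape


def _side(lemma, pos):
    """Render one side of an entry: the escaped lemma plus its POS symbol."""
    # Spaces inside a lemma are written as <b/> in Apertium dictionaries.
    text = escape(lemma).replace(" ", "<b/>")
    # The POS tag is assumed to be a plain symbol name such as "n" or "adv".
    return f'{text}<s n="{escape(pos)}"/>'


def bidix_entry(bel, rus, pos):
    """Format one bilingual entry (Belarusian on the left, Russian on the right)."""
    return f"<e><p><l>{_side(bel, pos)}</l><r>{_side(rus, pos)}</r></p></e>"


def convert(lines):
    """Turn hypothetical 'bel<TAB>rus<TAB>pos' lines into bidix entries.

    Blank lines, '#' comments and rows without exactly three fields are skipped.
    """
    entries = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = [field.strip() for field in line.split("\t")]
        if len(parts) != 3:
            continue
        entries.append(bidix_entry(*parts))
    return entries


if __name__ == "__main__":
    sample = ["# bel\trus\tpos", "кніга\tкнига\tn", "таму што\tпотому что\tcnjsub"]
    for entry in convert(sample):
        print(entry)
```

The generated entries would still have to be checked by hand and placed inside the section of the bel-ru bidix, which is the "add and check" work planned for the following weeks.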

List your skills and give evidence of your qualifications

I'm currently a second-year bachelor's student at ITMO University in Saint Petersburg (Russia).
Languages: Russian (native), Belarusian (good), basic knowledge of Polish.
Programming skills: C, C++, Java, Python and some scripting languages. Basic knowledge of machine learning.

List any non-Summer-of-Code plans you have for the Summer

I have no non-GSoC plans for the summer and can spend about 50 hours a week on the project.