User:GD/proposal


Contact information

Name: Evgenii Glazunov

Location: Moscow, Russia

University: NRU HSE, Moscow (National Research University Higher School of Economics), 3rd-year student

E-mail: glaz.dikobraz@gmail.com

IRC: G_D

Timezone: UTC+3

Github: https://github.com/dkbrz

Am I good enough?

Education: Bachelor's Degree in Fundamental and Computational Linguistics (2015-2019) at NRU HSE

Courses:

  • Programming (Python, R, Flask, HTML, XML, Machine Learning)
  • Morphology, Syntax, Semantics, Typology/Language Diversity
  • Mathematics (Discrete Mathematics, Linear Algebra and Calculus, Probability Theory, Mathematical Statistics, Computability and Complexity, Logic, Graphs and Topology)
  • Latin, Latin in modern Linguistics, Ancient Literature

Languages: Russian (native), English (academic), French (A2-B1), Latin (a bit), German (A1)

Personal qualities: responsibility, punctuality, being hard-working, passion for programming, perseverance, resistance to stress

Why is it that I am interested in machine translation? Why is it that I am interested in Apertium?

Information now circulates too fast to leave time for human translation. I am truly interested in formal methods and models because, as I see it, they represent the way any language is constructed. Despite some exceptions, language is in general very logical, and the main problem is finding a proper systematic description. Apertium is a powerful platform that allows one to build impressive rule-based engines. Languages like Latin are well-ordered, particularly in their morphology, which makes rule-based translation very promising.

Which of the published tasks am I interested in? What do I plan to do?

I would like to add a Latin-Russian language pair. I plan to do my best to achieve good results; more details are given in the Proposal part.

Proposal

Why should Google and Apertium sponsor it? How and whom will it benefit in society?

I think there is a lot of math in language, and the graph representation of dictionaries is an exciting idea because it adds a kind of cross-validation and a system-internal source of information. This information helps to fill some of the lacunae that appear while creating a dictionary. It will improve the quality of translation as we manage to expand the bidix.
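The graph idea can be illustrated with a minimal sketch. It is not Apertium code and does not read real bidix files: the function names (build_graph, suggest_translations), the language codes, and the toy entries are all hypothetical and chosen only for illustration. Each word is a node, each existing dictionary entry is an edge, and a missing Latin-Russian entry is suggested when both words share a neighbour in a third language. The number of shared neighbours acts as a rough cross-validation score.

```python
from collections import defaultdict

# Toy bilingual entries (hypothetical, for illustration only).
# Each node is a (language, word) tuple; each pair is one dictionary entry.
ENTRIES = [
    (("lat", "aqua"), ("eng", "water")),
    (("eng", "water"), ("rus", "вода")),
    (("lat", "aqua"), ("fra", "eau")),
    (("fra", "eau"), ("rus", "вода")),
    (("lat", "liber"), ("eng", "book")),
    (("eng", "book"), ("rus", "книга")),
]


def build_graph(entries):
    """Build an undirected graph: every dictionary entry becomes an edge."""
    graph = defaultdict(set)
    for left, right in entries:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def suggest_translations(graph, source, target_lang):
    """Suggest target-language words reachable from `source` via one pivot word.

    Returns a dict mapping each candidate word to the number of distinct
    pivot words that support it. Words already linked directly to `source`
    are skipped, because they are not gaps.
    """
    direct = graph.get(source, set())
    support = defaultdict(int)
    for pivot in direct:
        for node in graph.get(pivot, ()):
            lang, word = node
            if lang == target_lang and node not in direct:
                support[word] += 1
    return dict(support)


graph = build_graph(ENTRIES)
print(suggest_translations(graph, ("lat", "aqua"), "rus"))
print(suggest_translations(graph, ("lat", "liber"), "rus"))
```

In this toy data, "вода" is supported by two pivots (English and French) and "книга" by one, so a threshold on the support count would be one simple way to decide which suggestions are safe enough to add automatically and which need manual review.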

Coding Challenge

Week by week work plan

Week 0: Preparation

First phase

Week 1:

Week 2:

Week 3:

Week 4:

Second phase

Week 5:

Week 6:

Week 7:

Week 8:

Third phase

Week 9:

Week 10:

Week 11:

Week 12:

Final evaluation

Non-Summer-of-Code plans you have for the Summer

GSoC is the only project I have this summer. I have some exams at the end of June.