Difference between revisions of "User:Denis Rakhman/proposal"

Revision as of 13:53, 3 April 2017

Contact information

Name: Denis Rakhman
E-mail: drahman2@mail.ru
IRC: Denis_Rakhman
Phone number: 8-968-815-43-81
Location: Moscow

Why am I interested in machine translation?

Machine translation is clearly one of the main areas of computational linguistics. The usefulness of a good machine translator can hardly be overstated.
But that is not what excites me about machine translation.
When I knew nothing about either theoretical or computational linguistics, I never thought of natural languages as sets of rules. In fact, I did, but in my mind the rules had been invented by a group of very smart people in heavy glasses. It was a shock to me to realize that linguistic rules are no less strict than physical ones. I thought: "Wow! Maybe language can be modelled as an algorithm?" And then I was told about NLP and, in particular, about machine translation.
Machine translation is one of the few areas of NLP that deal not only with the structure of a particular language but also with language typology. This means that linguistic theory plays a larger part in it than in other NLP problems, which also attracts me.

Why am I interested in Apertium?

The main thing that attracts me to Apertium is its interest in minority languages. This area is both very interesting to me and very important for society. Minority languages are often endangered, and the fact that a language is not only being described by linguists but also used in machine translation can encourage its speakers and help give it a new life.
I am also personally interested in machine translation for minority languages. Firstly, it is machine translation. Secondly, minority languages (for example, Hill Mari) are a very important part of the research activity of our university and, in particular, of my own.
Apertium also has an extremely friendly community, and this fact attracts me even more.

The task

I would like to work on Hill Mari, for example on the Hill Mari - Russian language pair. Other tasks (for example, ones related to Chukchi) are also possible.

Why should Google and Apertium sponsor it and which social benefits can it bring?

The purpose of this work is to create an mrj-rus transducer. It will be a complete product that anyone can use for any purpose.
Moreover, Hill Mari is one of the official languages of the Mari El Republic. This means that, besides the social benefits described above, such a translator can be useful for local schools, libraries, etc.

Work plan


Skills, knowledge and experience

I am currently a third-year bachelor's student at the Linguistics Department of NRU HSE, Moscow.
Knowledge:

Programming:
  • Python 3

Linguistics:
  • both functional and formal approaches to syntax
  • morphology
  • phonetics
  • lexical semantics
  • language typology

Languages:
  • Russian (native)
  • English (advanced)
  • Italian (intermediate)
  • French (intermediate)

Skills:

Programming:
  • Python 3, pymorphy2 (a morphological analyser for Russian)
  • HTML, CSS

Linguistics:
  • grammar description during field work, glossing, analysis of older grammar descriptions and theories

Experience:

Coding:
  • extraction of distant verb arguments in coordinate clauses

Linguistics:
  • purpose clauses in Hill Mari (field research)