User:Zfe/Application

Revision as of 23:51, 4 April 2011 by Zfe

Who!?

Name: Gianluca Grossi
email: me@ggrossi.com
irc: zfe @ freenode
other contacts: skype: giagrossi

Why is it you are interested in machine translation?

I first came across the Apertium project on freenode, where I am a regular on #linguistics. Although I am a law student, I have always been interested in programming and linguistics. I had a solid education in linguistics, especially in high school, where I was taught both Latin and Ancient Greek, and I have kept a strong interest in linguistics-related matters ever since. I am interested in machine translation because I am a huge fan of automation, especially when it comes to typically human tasks like translation. Writing morphological analyzers and transfer rules is a really challenging process for my mind, and it keeps me from getting bored.

Why is it that you are interested in the Apertium project?

As I said before, I'm really interested in programming and linguistics. Apertium is free software, so it gives me the chance to develop something that anybody can use, rewrite and modify for any purpose. That makes me even more enthusiastic about creating a new language pair, since my work will probably be reused by somebody else for very different purposes. Apertium lacks a Turkic-to-Turkic language pair and, being a huge fan of Turkic languages, I think it would be worth trying. Given how similar the languages I'd like to work on are, Apertium is the right environment for creating such a pair. In addition, I have had the chance to interact with members of the Apertium community, and it is an environment I really like: they seem both knowledgeable and willing to help new members like me.

Which of the published tasks are you interested in? What do you plan to do?

Apertium-tr-az: machine translation between Turkish and Azerbaijani (a savage journey to the heart of the Turanist dream).

Why should Google and Apertium sponsor it? How and who will it benefit in society?

Apertium doesn't have any Turkic language pair at release quality. Turkish is the most widely spoken Turkic language, with 80M speakers. Azerbaijani has some 20M speakers: about 8M speak the Northern variant, which is the official language of Azerbaijan, while the other 12M live mostly in Iran, where their language is not recognized as an official language. For this reason, far fewer resources are available in Azerbaijani than in Turkish, especially when it comes to educational tools. If the project reaches good quality, I think it would be possible (and useful) to give native Azerbaijani speakers the chance to have Turkish resources automatically translated into their native language. In addition, my project will require developing a morphological analyzer for Azerbaijani, which could be reused in the future for other language pairs involving Azerbaijani.


Work Plan

What needs to be done

  1. We need a morphological analyzer for Azerbaijani: I'm already working on it. It is called azmorph and can be downloaded from the Apertium SVN. I'm basing it on TRmorph, because this lets me develop it much faster. At the moment it has working vowel harmony; it can conjugate the present aorist (the only present tense Azerbaijani has) in the affirmative and negative; and it can handle cases and other noun inflections such as plural and comitative/instrumental suffixes. You can easily check what azmorph can already do by running the script in the azmorph directory.
  2. We need a bidix dictionary: