Difference between revisions of "User:Ilnar.salimzyan/GSoC2014/Application"
You can find my proposal for GSoC 2014 [http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/selimcan/5649050225344512 here]. |
|||
Remember that this is only a preview :) |
|||
[[Category:GSoC_2014_Student_proposals|Ilnar.salimzyan]] |
|||
== GSoC application: Apertium-kaz-tat: machine translation between Kazakh and Tatar == |
|||
'''Name:''' Ilnar Salimzyanov |
|||
'''E-mail address:''' ilnar.salimzyan@gmail.com
|||
''Other information that may be useful to contact you:'' |
|||
'''IRC:''' selimcan '''Sourceforge account:''' selimcan '''Cellphone:''' +79625617985 '''Timezone:''' UTC+04.00 |
|||
=Why is it that you are interested in machine translation?=
|||
=Why is it that you are interested in the Apertium project?= |
|||
=Which of the published tasks are you interested in? What do you plan to do?= |
|||
'''Task:''' |
|||
''Adopting a language pair'' |
|||
'''Title:''' |
|||
''Apertium-kaz-tat — machine translation between Kazakh and Tatar'' |
|||
==Why should Google and Apertium sponsor it?== |
|||
==How and whom it will benefit in society?== |
|||
=Work plan= |
|||
=Work To do= |
|||
==Before the coding period:== |
|||
==The coding period:== |
|||
==Non-GSoC activities== |
|||
==List your skills and give evidence of your qualifications== |
|||
I am a first-year master’s student at Kazan Federal University, studying Applied Linguistics.<ref>A not-so-clear term that has caused much debate. What we study is a mix of computational linguistics, lexicography and several other subjects.</ref>
|||
I first learned about Apertium in 2009, while writing a short university paper comparing the available machine translation systems. Apertium fascinated me because it was open source, was growing rapidly, and looked like a good starting point for Tatar and other Turkic languages (yes, I have thought about them too). I experimented with an lttoolbox dictionary for Tatar (a bad idea, I know, but I didn’t know about "X/S/HFST"s then, and no other Turkic languages were involved yet). I even managed to model noun morphotactics with it!
|||
Back in 2009 I translated part of the Official Documentation into Russian<ref>See /apertium/trunk/apertium-documentation/apertium-2.0/ru/apertium_docu.odt</ref> (up to chapter 3.2.3; besides someone willing to finish it, the translation needs a good editor). Also in 2009, I translated the Apertium New language pair Howto into Russian.
|||
I was one of the participants in the Šupaškar Apertium Workshop, held in January this year, where Francis Tyers, Hector Alos-i-Font, Jonathan Washington and Trond Trosterud were instructors.
|||
I was very fortunate to see Jonathan and Francis work on the Tatar-Bashkir pair as an example pair for the Šupaškar Workshop and move it to nursery. Having a transducer for my native language (and for the language closest to it) is very useful for learning the semantics and structure of lexc and twol files (which I wasn’t really familiar with, since using HFST with Apertium is relatively new and is not mentioned in the Official Documentation), along with reading the famous FSMBook.
|||
I have been involved in the work on the Tatar-Bashkir pair as, let’s say, a “language consultant” and “tester”.<ref>See the accepted but not-yet-published paper here: https://www.softconf.com/lrec2012/TurkicLanguage2012/cgi-bin/scmd.cgi?scmd=getFinal&passcode=18X-P9A6A3D6H8&_lDoc=Paper</ref> Together with a colleague from Ufa, I have been translating the top-5000 wordlist of the Russian National Corpus into Tatar and Bashkir. These translations were then added to the translator files. I have also been analyzing errors in the translations to find out where Apertium-tt-ba performed poorly, describing them on the wiki<ref>The Morphology of Tatar Language</ref> and committing to svn from time to time.
|||
==References== |
|||
<references/> |
|||
[[Category:GSoC 2012 Student Proposals]] |
Latest revision as of 13:17, 14 May 2014