Difference between revisions of "User:Ljmocic/GSoC 2016 proposal"
Latest revision as of 01:50, 20 March 2016
Contents
- 1 Contact information
- 2 Why is it you are interested in machine translation?
- 3 Why is it that you are interested in the Apertium project?
- 4 Which of the published tasks are you interested in? What do you plan to do?
- 5 Reasons why Google and Apertium should sponsor it.
- 6 How and who it will benefit in society.
- 7 Work plan
- 8 List your skills and give evidence of your qualifications.
- 9 List any non-Summer-of-Code plans you have for the Summer
Contact information
Name: Ljubiša Moćić
E-mail address: ljubisa.mocic[at]gmail.com
IRC: ljmocic
SourceForge: lmocic
Why is it you are interested in machine translation?
I am interested in machine translation because I have found many of its applications very useful to my community and to me. The first is removing language barriers, which is one of the best things machine translation can provide. Machine translation is also a very complex area: development is not easy, and it requires a lot of time spent developing, updating and refining. It is clearly challenging, but that is what makes it so interesting. My interest grew after I discovered how much more useful it becomes when combined with artificial intelligence, robotics and natural language processing.
Why is it that you are interested in the Apertium project?
I am interested in the Apertium project for many reasons. The first is accuracy. While many projects try to produce very accurate machine translation, most fall short. Apertium has an advantage here, mainly because it focuses on quality over quantity. Even though the complexity “under the hood” is high, it delivers quality translation. Of course, being open source is one of the main reasons why Apertium is amazing. I plan to use Apertium for my research in the future, so extending Apertium's library of language pairs would be very useful to me. Also, I worked with Apertium during Google Code-in and I liked the team and the atmosphere.
Which of the published tasks are you interested in? What do you plan to do?
Title
Adopt an unreleased language pair.
Reasons why Google and Apertium should sponsor it.
It should be sponsored because there is no existing high-quality machine translation tool for Serbo-Croatian to Russian, and this project would create one with the help of Apertium and Google. The language pair would be useful to the wider public in Serbia, Croatia and Russia (and also in Montenegro and Bosnia and Herzegovina, because of the similarity between the languages).
How and who it will benefit in society.
Besides benefiting those who are learning sh-ru in one direction or the other, I would introduce my professors and colleagues to this project. In particular, it would benefit students who wish to delve deeper into the subject of machine translation. It could also serve about 170 million people (Russia, Croatia, Serbia, Bosnia, Montenegro). The project may also produce useful documentation for future development.
Work plan
Before the GSoC coding period begins, I will focus on:
- Connecting with the community
- Exploring and understanding the Apertium development environment
- Researching more about machine learning
- Improving my knowledge of the hbs-rus language pair
Week 1:
- Finish coding challenge, run testvoc
Week 2:
- Write lexical selection rules, write transfer rules
Week 3:
- Add more nouns, verbs, pronouns
Week 4:
- Add more nouns, adjectives
Deliverable #1: Extended dictionary, added/improved lexical/transfer rules.
Week 5:
- Add more adverbs, verbs
Week 6:
- Continue extending hbs-rus bilingual dictionary
Week 7:
- Add/improve transfer rules, extend word coverage
Week 8:
- Clean up, run testvoc
Deliverable #2: Dictionary extended to trunk level, higher word coverage.
Week 9:
- Add/adjust rules as necessary, extend word coverage
Week 10:
- Perform thorough testing
Week 11:
- Write wiki pages
Week 12:
- Clean up, last-minute fixes
Deliverable #3: Language pair (release quality) and documentation.
List your skills and give evidence of your qualifications.
Education: I am in the 2nd year of a Bachelor’s degree in Computer Science and Engineering at the Faculty of Technical Sciences, University of Novi Sad.
Languages: I can’t say whether Serbian or Russian is my native language, because I have spoken both for as long as I can remember. I know Croatian, Bosnian and Montenegrin at a good level, mainly due to their similarity. Besides these Slavic languages, I have reached B1 level in German.
Open Source: I gained experience working on this language pair during Google Code-in. I have contributed to Apertium, Amarok, openSUSE, SurveyMonkey and SymPy.
Programming languages: Most used: Python, C/C++.
Experience through projects: Bash, HTML, CSS, XML, Matlab/Octave.
Basic familiarity: Java, JavaScript, Assembler, VHDL.
List any non-Summer-of-Code plans you have for the Summer
Exams at my faculty are scheduled from June 12th to July 15th. During that period I will probably have to spend less time working on the project, but I will make up those hours. I plan to work at least 30 hours per week on average.