Difference between revisions of "User:Uliana/gsoc-propuesta"
Revision as of 18:16, 17 March 2016
Contacts
Uliana Sentsova
E-mail: uliana.sentsova@gmail.com
Number: +7 (916) 774-95-30
Skype: ulyanasidorova
IRC channel: uliana at #apertium
Education and achievements
Lomonosov Moscow State University
Qualification: Bachelor in Linguistics (Romance and Germanic languages)
GPA: 10.0 / 10.0
National Research University "Higher School of Economics"
Qualification: Major in Natural Language Processing
Current GPA: 8.5 / 10.0
2015: Awardee of the graduates' competition "Natural Language Processing" (a competition for students held by the National Research University Higher School of Economics)
2014: Scholarship of the Academic Council of MSU for scientific activities (a special award for the top 10% of students, recognizing academic excellence and scientific activity)
2013: Enhanced State Academic Scholarship for scientific activities (awarded on the basis of academic excellence and scientific achievements)
Projects
"Building Open Source Information Extraction System for Russian Language"
Organisation: National Research University "Higher School of Economics"
Project roles: project manager, software developer (Python)
Description: Creating a hybrid information extraction system that combines a rule-based approach with machine learning. The system extracts named entities (persons, locations and organizations) and will become part of the NLP technology stack developed by the National Research University "Higher School of Economics". The system currently has 93% precision (evaluated by the Dialogue Evaluation Conference on 37,000 annotated texts).
My interest in Machine Translation
My interest in Apertium projects
I am interested in working on an unreleased language pair for Sicilian-Spanish translation.
As my coding challenge I created a new language package, scn-spa, and added basic vocabulary to the Sicilian dictionary and translations to the Sicilian-Spanish dictionary. I am also currently working on
I have also started researching the structure of the Sicilian language: I have got in touch with contributors to the Sicilian-language Wikipedia and, thanks to spectei, I have also reached a computational linguist who studies in Munich and is a native speaker of Sicilian.
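To illustrate the kind of dictionary work mentioned for the coding challenge, here is a minimal sketch of how one entry of a bilingual dictionary could be generated. It assumes the usual Apertium bilingual dictionary layout (an `<e>` element holding a `<p>` pair with `<l>` and `<r>` sides and `<s n="..."/>` tags). The helper name `bidix_entry`, the word pair and the tags are hypothetical examples chosen for illustration and are not taken from the actual scn-spa files.

```python
import xml.etree.ElementTree as ET


def bidix_entry(left, right, tags):
    """Build one <e> entry pairing a source lemma with a target lemma.

    Hypothetical helper: both sides receive the same list of tags,
    which is a simplification of real bilingual dictionary entries.
    """
    entry = ET.Element("e")
    pair = ET.SubElement(entry, "p")
    for side, lemma in (("l", left), ("r", right)):
        node = ET.SubElement(pair, side)
        node.text = lemma
        for tag in tags:
            # Each morphological tag becomes an empty <s n="..."/> element.
            ET.SubElement(node, "s", {"n": tag})
    return entry


# Illustrative pair only: a feminine noun with the same form on both sides.
entry = bidix_entry("casa", "casa", ["n", "f"])
print(ET.tostring(entry, encoding="unicode"))
```

Generating entries from a word list in this way is only one option; entries can equally be written by hand in the dictionary file.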
Proposal and work plan
| Period | Week | Description | Comments |
|---|---|---|---|
| Pre-work Period | 09:00–09:30 | 0: Overview | Getting to know each other |
| | 09:30–10:30 | | General introduction: Machine translation |
| | 10:30–11:00 | | Coffee break |
| | 11:00–12:00 | | Introduction: The Apertium machine-translation platform |
| | 12:00–13:00 | | Practical: Installing Apertium and creating a language pair |
| | 13:00–14:00 | Lunch | |
| | 14:00–14:30 | 1: Basic dictionaries | Theory: Morphology and morphotactics |
| | 14:30–15:00 | | Coffee break |
| | 15:00–17:00 | | Practical: Paradigms and continuation lexica |
| First Month | 09:00–10:00 | 2: Advanced dictionaries | Practical: Creating dictionaries |
| | 10:00–11:30 | | Theory: Morphophonology |
| | 11:30–12:00 | | Coffee break |
| | 12:00–13:00 | | Practical: Working on morphology |
| | 13:00–14:00 | Lunch | |
| | 14:00–14:30 | 3: Morphological disambiguation | Theory: Morphological and syntactic disambiguation |
| | 14:30–15:00 | | Coffee break |
| | 15:00–17:00 | | Practical: Writing rules for morphological disambiguation |
| Second Month | 09:00–09:30 | 4: Lexical transfer | Practical: Dictionary work |
| | 09:30–10:00 | | Theory: Lexical transfer |
| | 10:00–11:00 | | Practical: Work on bilingual dictionaries |
| | 11:00–11:30 | | Coffee break |
| | 11:30–12:00 | | Theory: Lexical selection |
| | 12:00–13:00 | | Practical: Working on lexical selection |
| | 13:00–14:00 | Lunch | |
| | 14:00–14:30 | 5: Structural transfer | Theory: Basic structural transfer |
| | 14:30–15:00 | | Coffee break |
| | 15:00–17:00 | | Practical: Writing rules for structural transfer |
| Third Month | 09:00–09:30 | 6: Multi-level structural transfer | Theory: Multi-level structural transfer |
| | 09:30–11:00 | | Practical: Writing transfer rules |
| | 11:00–11:30 | | Coffee break |
| | 11:30–13:00 | | Practical: Writing transfer rules |
| | 13:00–14:00 | Lunch | |
| | 14:00–15:00 | 6: Multi-level structural transfer | Discussion: Uralic comparative grammar |
| | 15:00–15:30 | | Coffee break |
| | 15:30–17:00 | | Practical: Writing transfer rules |
| 17th May | 09:00–09:30 | 7: Data consistency, quality and evaluation | Theory: Data consistency, quality and evaluation |
| | 09:30–11:00 | | Practical: Finding and fixing errors |
| | 11:00–11:30 | | Coffee break |
| | 11:30–13:00 | | Practical: Finding and fixing errors |
| | 13:00–14:00 | Lunch | |
| | 14:00–14:30 | 8: Project planning, questions and answers | Theory: Project planning, questions and answers |
| | 14:30–15:00 | | Practical: Finding and fixing errors |
| | 15:00–17:00 | | Conclusion: Round table on making machine translation systems |