User:Deltamachine

From Apertium
Revision as of 13:10, 9 March 2017 by Deltamachine (talk | contribs)

Contact information

Name: Anna Kondratjeva

Location: Moscow, Russia

E-mail: an-an-kondratjeva@yandex.ru

Phone number: +79250374221

VK: http://vk.com/anya_archer

Github: http://github.com/deltamachine

IRC: deltamachine

Timezone: UTC+3

Skills and experience

Education: Bachelor's Degree in Fundamental and Computational Linguistics (2015 - expected 2019), National Research University «Higher School of Economics» (NRU HSE)

Main university courses:

  • Programming (Python)
  • Computer Tools for Linguistic Research
  • Theory of Language (Phonetics, Morphology, Syntax, Semantics)
  • Language Diversity and Typology
  • Introduction to Data Analysis
  • Math (Discrete Math, Linear Algebra and Calculus, Probability Theory and Mathematical Statistics)

Technical skills: Python (experienced, 1.5 years), HTML, CSS, Flask, Django, SQLite (familiar)

Projects and experience: http://github.com/deltamachine

Languages: Russian, English, German


Why is it that you are interested in machine translation?

Why is it that you are interested in Apertium?

Which of the published tasks are you interested in? What do you plan to do?

Reasons why Google and Apertium should sponsor it

A description of how and who it will benefit in society

Work plan

Post application period

Community bonding period

Work period

Week 1:
Week 2:
Week 3:
Week 4:
Deliverable #1
Week 5:
Week 6:
Week 7:
Week 8:
Deliverable #2
Week 9:
Week 10:
Week 11:
Week 12:
Project completed


Non-Summer-of-Code plans you have for the Summer

Coding challenge

https://github.com/deltamachine/wannabe_hackerman

  • apertium_challenge1: Write a script that takes a dependency treebank in UD format and "flattens" it, that is, applies the following transformations:
    • Words with the @conj relation take the label of their head
    • Words with the @parataxis relation take the label of their head
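A minimal sketch of the flattening described for apertium_challenge1, not the code from the repository. It assumes the treebank is in the 10-column, tab-separated CoNLL-U layout, changes only the relation label (the head index is left as it is), and follows chains of conj/parataxis upwards until another label is found. The function names `flatten_sentence` and `flatten_conllu` are hypothetical.

```python
# Column indexes in a 10-column CoNLL-U token line (assumed input layout).
ID, HEAD, DEPREL = 0, 6, 7
FLATTENED = {"conj", "parataxis"}


def flatten_sentence(rows):
    """Return copies of the token rows with conj/parataxis labels replaced."""
    by_id = {row[ID]: row for row in rows}

    def resolve(token_id, seen):
        row = by_id[token_id]
        # Stop at any other label, at the root (head "0"), or on a cycle.
        if row[DEPREL] not in FLATTENED or row[HEAD] not in by_id or token_id in seen:
            return row[DEPREL]
        return resolve(row[HEAD], seen | {token_id})

    return [row[:DEPREL] + [resolve(row[ID], frozenset())] + row[DEPREL + 1:]
            for row in rows]


def flatten_conllu(text):
    """Flatten every sentence in a CoNLL-U string; comments are kept as is."""
    output, sentence = [], []

    def flush():
        output.extend("\t".join(row) for row in flatten_sentence(sentence))
        sentence.clear()

    for line in text.splitlines():
        if line and not line.startswith("#"):
            sentence.append(line.split("\t"))
        else:
            # A comment or blank line ends the token rows collected so far.
            flush()
            output.append(line)
    flush()
    return "\n".join(output)
```

For example, in "Cats and dogs sleep", where "dogs" is attached to "Cats" as conj and "Cats" is the nsubj of "sleep", the label of "dogs" becomes nsubj while its head stays "Cats".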

  • apertium_challenge2: Write a script that takes a sentence in Apertium stream format and for each surface form applies the most frequent label from the labelled corpus.
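The most-frequent-label baseline in apertium_challenge2 can be sketched as follows. This is an illustration under assumptions, not the repository code: the labelled corpus is assumed to be already reduced to (surface form, label) pairs, lexical units are assumed to look like `^surface/analysis$` without escaped characters, unseen forms get a placeholder `_`, and the result is returned as a list of pairs because the page does not specify an output format. The names `train` and `label_sentence` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Matches one lexical unit, e.g. ^dogs/dog<n><pl>$, and captures the surface form.
# Escaped characters inside a unit are not handled in this sketch.
LEXICAL_UNIT = re.compile(r"\^([^/$]+)(?:/[^$]*)?\$")


def train(pairs):
    """Map each surface form to the label it carries most often."""
    counts = defaultdict(Counter)
    for surface, label in pairs:
        counts[surface][label] += 1
    return {surface: c.most_common(1)[0][0] for surface, c in counts.items()}


def label_sentence(stream, model, default="_"):
    """Return (surface form, label) pairs for a sentence in stream format."""
    return [(surface, model.get(surface, default))
            for surface in LEXICAL_UNIT.findall(stream)]


# Hypothetical toy corpus: "dogs" is seen twice as nsubj and once as obj.
model = train([("dogs", "nsubj"), ("dogs", "obj"), ("dogs", "nsubj"), ("sleep", "root")])
labelled = label_sentence("^dogs/dog<n><pl>$ ^sleep/sleep<vblex><pres>$", model)
```

Matching here is case-sensitive, so a sentence-initial "Dogs" would not find the counts collected for "dogs"; lowercasing both sides is one possible refinement.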