
Contact information

  • Name: Giannis Konstantinidis
  • E-mail: giankonstantinidis(at)gmail(dot)com
  • IRC: giannisk on
  • SourceForge: giankon

Why is it you are interested in machine translation?

I have worked a lot with translations: since 2009, I've contributed to various f/oss projects by localising their software into Greek. I think localisation is very important because it enables non-English-speaking people to use software in their own language.

But localisation only lets people use software in their own language. What about content on the internet that isn't in their language? This is where machine translation comes in, as it can meet that need to some extent. That is what made me interested in this field as well.

Why is it that you are interested in the Apertium project?

Apertium is Free/Open Source Software, and I deeply appreciate that. There are a lot of proprietary machine translation platforms at the moment, so it's important to promote a free/open source alternative capable of doing the same work; Apertium is one of those.

I've previously contributed to Apertium to some extent: I completed a few tasks during Google Code-In 2011 in an attempt to grow the el-en dictionaries. During that GCI I was about 16 years old and a high school student, and back then I was thinking that I could work on completing the Modern Greek-English language pair. However, during the 2012-2013 school year (until May 2013 in particular) I was focusing on my studies, as I had to prepare for the Panhellenic Examinations; getting good grades in those exams is required for admission to higher education (universities) in Greece. Eventually I achieved my goal, and now that I'm over 18 and enrolled in a university I can dedicate some of my time. I'm also eligible to participate in GSoC, so it's a good combination, and I guess the time has finally come to make this idea come true :)

Which of the published tasks are you interested in? What do you plan to do?

Adopting the Modern Greek->English language pair.

There's already some work done for the el-en pair, but I guess there's still a LOT of work to be done. My plan is to extend and complete the el-en pair with as low a WER (word error rate) as possible.
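
WER is the word-level edit distance between the system output and a reference translation, divided by the number of words in the reference. Below is a minimal sketch of that calculation, assuming simple whitespace tokenisation. The function name is illustrative, and this is not Apertium's own evaluation tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One missing word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A lower value is better; 0.0 means the output matches the reference word for word.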

Some public institutions here in Greece have dictionaries which are freely available (public domain), and I'm thinking of writing a simple script that extracts words from these dictionaries and adds them to the apertium-el-en ones. There should be other related dictionaries available free (as in freedom, of course) on the internet that can also be used, though I haven't researched this yet.
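
A minimal sketch of what such an import script could look like. It assumes the source dictionary has already been exported as tab-separated lines of the form `greek<TAB>english<TAB>pos`; that input format, the function names, and the small set of accepted tags are all assumptions for illustration. Real apertium-el-en entries may need extra tags (for example gender) and matching entries in the monolingual dictionaries.

```python
from xml.sax.saxutils import escape

# Part-of-speech symbols accepted by this sketch (an assumed subset).
ALLOWED_POS = {"n", "adj", "adv", "vblex"}


def lemma_xml(lemma: str) -> str:
    """Escape a lemma and write spaces in multiword lemmas as <b/>."""
    return "<b/>".join(escape(part) for part in lemma.split())


def bidix_entry(greek: str, english: str, pos: str) -> str:
    """Build one simplified bilingual dictionary entry."""
    tag = f'<s n="{pos}"/>'
    return (f"<e><p><l>{lemma_xml(greek)}{tag}</l>"
            f"<r>{lemma_xml(english)}{tag}</r></p></e>")


def convert(lines):
    """Turn tab-separated lines into entries, skipping bad or duplicate rows."""
    seen = set()
    entries = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = tuple(field.strip() for field in line.split("\t"))
        if len(fields) != 3 or not all(fields):
            continue  # malformed row
        if fields[2] not in ALLOWED_POS or fields in seen:
            continue  # unknown tag or duplicate
        seen.add(fields)
        entries.append(bidix_entry(*fields))
    return entries


if __name__ == "__main__":
    for entry in convert(["σπίτι\thouse\tn", "σπίτι\thouse\tn"]):
        print(entry)
```

The generated entries would still need manual review before being pasted into the dictionary, since a word list alone does not say which paradigm or translation is the right one.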

Reasons why Google and Apertium should sponsor it

I would like to enable the Greek-speaking community around the world to use a free/open source platform for translating text. That also means anyone will be able to adapt or improve my work later to fit their needs. Currently there are some platforms offering machine translation from Greek to English and vice versa, but we need a good f/oss alternative.

Work plan

Here's a brief schedule, subject to change if necessary.

Community Bonding Period:

  • Get familiar with the Apertium tools and community. Go through all available documentation on the wikis, etc.
  • Resume working on the coding challenge text; try to cover the majority of it with a low WER
  • Clean up the current el-en dictionaries
  • Search and find freely available Greek-English dictionaries which can be used

Work Period:

  • Week 1:
    • Write a quick script to import entries from other dictionaries
    • Start adding nouns
  • Week 2:
    • Start adding adjectives and adverbs
  • Week 3:
    • Start adding verbs
  • Week 4:
    • Finish adding verbs, work on pronouns, etc.
  • Week 5:
    • Extend the Greek-English bilingual dictionary
  • Week 6:
    • Add/improve transfer rules
    • Clean up code
    • Run testvoc
  • Week 7:
    • Add/adjust rules as necessary
    • Start extending word coverage
  • Week 8:
    • Run testvoc
    • Add/adjust rules as necessary
    • Keep on extending word coverage
  • Week 9:
    • Add/adjust rules as necessary
    • Keep on extending word coverage
  • Week 10:
    • Perform thorough testing
    • Add/adjust rules as necessary
    • Keep on extending word coverage
  • Week 11:
    • Final efforts on extending dictionaries; perform tests
  • Week 12:
    • Start cleaning up and writing documentation
  • Week 13:
    • Deliver a working pair with the lowest possible WER

List your skills and give evidence of your qualifications

I'm an undergraduate student at the Dept. of Information & Communication Systems Engineering of the University of the Aegean, in Greece.

I stand for Free/Open Source Software. I've been contributing to various projects through localisation and outreach.

I'm a native Greek speaker and fluent in English. Also, I can speak some basic German.

So far, I've worked a lot with C, HTML/CSS/JavaScript, and XML, and I have some basic C++ and Python for the time being.

List any non-Summer-of-Code plans you have for the Summer

During July and August, I might be working part-time as a Computer & Network Technician for a few hours per day. However, I'm going to spend at least 35 hours per week on Apertium (that is, at least 5 hours daily). I'm planning to dedicate a lot of my time to GSoC/Apertium to produce a good result, so no worries. :)