User:Supasonk/proposal


Name : Supason Kotanut
E-mail address : supasonk@gmail.com
Github : https://github.com/supasonk

Why are you interested in machine translation?

I’m interested in machine translation because I think it is very useful to work out the grammar of the language pair that we want to translate. With just a rule-based system, we can translate from one language to another, which I find amazing. Another thing that I like about machine translation is that it can work without an internet connection, so people who do not have internet access can still use it for their education.

Why are you interested in Apertium?

I am interested in Apertium because I think it is great to create machine translation for nearly extinct languages. Those languages are hard to learn without a native speaker, so a translation system can help people who want to learn them and, in turn, help preserve them from extinction. This project can also serve as a stepping stone for other closely related language pairs such as Lao - Phu Thai and Thai - Phu Thai, so it will benefit people who want to build machine translation for those languages.

Which of the published tasks are you interested in? What do you plan to do?

Adopt an unreleased language pair. I plan to work on the Thai - Lao language pair because the grammars of Thai and Lao are closely related and, as far as I know, Apertium has not yet had a language pair that needs to be inflected, so it will be interesting to work on these languages and see the result. Another reason is that Google Translate appears to translate between Thai and Lao through English, so it will be interesting to see what result we get using only a bilingual dictionary and rules. Translating between Thai and Lao is also harder than translating languages that put spaces between words, because the text has to be cut into words first. I have heard that Apertium has not yet had a language pair that requires this cutting, which makes the project more interesting to me.

Project Proposal: Thai - Lao Machine Translation Language Pair

Student : Supason Kotanut
Mentor : Francis Tyers

This project proposal describes a machine translation system for the Thai - Lao language pair, which users can use to translate between Thai and Lao.

Background, Opportunity, and Need

Apertium is a machine translation system, but as of now it does not include the Thai - Lao language pair. This is a great opportunity to implement it.

Why should Google and Apertium sponsor it?

This project will let people who want to learn Thai or Lao use it as a translation tool or a learning tool. Most translation tools require an internet connection, so this project will let people without a reliable connection translate to and from Thai and Lao right away. This will help close the equity gap between people who have an internet connection and people who do not. I think giving people an equal chance to compete with each other is an important goal that many people are working on.

Benefit

This project will benefit people who want to learn Thai or Lao by providing a tool for translating between the two languages without requiring an internet connection, and it will be free of cost. It will also be one of the first language pairs in Apertium that requires cutting text into words.

Work Plan

Post Application Period :

  • Week 1: Find datasets for both languages (Thai - Lao)
  • Week 2: Find datasets for both languages (Thai - Lao)
  • Week 3: Find datasets for both languages (Thai - Lao)
  • Week 4: Find datasets for both languages (Thai - Lao)

Community Bonding

  • Week 1: Find datasets for both languages (Thai - Lao)
  • Week 2: Find datasets for both languages (Thai - Lao)
  • Week 3: Map the appropriate words between Thai and Lao (semi-)automatically
  • Week 4: Implement the Thai - Lao dictionary

Working Period

  • Week 1: Implement the Thai - Lao dictionary
  • Week 2: Implement the part-of-speech rules representing Thai/Lao sentence structure
  • Week 3: Implement the part-of-speech rules representing Thai/Lao sentence structure
  • Week 4: Write documentation

Deliverable #1: Dictionary of Thai - Lao words, Thai/Lao part-of-speech rules

  • Week 5: Evaluate the first deliverable
  • Week 6: Implement the special-case part-of-speech rules representing Thai/Lao sentence structure
  • Week 7: Implement the special-case part-of-speech rules representing Thai/Lao sentence structure
  • Week 8: Implement the special-case part-of-speech rules representing Thai/Lao sentence structure / Write documentation

Deliverable #2: Special-case part-of-speech rules representing Thai sentence structure

  • Week 9: Evaluate the second deliverable / Implement transfer rules representing Thai/Lao sentence structure
  • Week 10: Implement transfer rules representing Thai/Lao sentence structure
  • Week 11: Implement transfer rules representing Thai/Lao sentence structure
  • Week 12: Write documentation

Deliverable #3: Rules representing Lao sentence structure

Goals

  • Usable part-of-speech, special-case part-of-speech, and transfer rules.
  • Coverage of more than 70%.
  • Test cases for further development.
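
One way to read the coverage goal is as the share of tokens in a test corpus that the dictionary recognises. The following is a minimal Python sketch under that assumption. The function name `coverage` and the toy data are hypothetical and only for illustration; this is not an Apertium tool.

```python
def coverage(tokens, known_words):
    """Return the fraction of tokens found in known_words (0.0 for empty input).

    Assumption: coverage is measured naively, per token, against a plain
    set of known surface forms.
    """
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


if __name__ == "__main__":
    # Toy corpus: three of the four tokens are in the dictionary.
    corpus_tokens = ["a", "b", "c", "d"]
    dictionary_words = {"a", "b", "c"}
    ratio = coverage(corpus_tokens, dictionary_words)
    print(f"coverage: {ratio:.0%}")
    print("goal met" if ratio > 0.70 else "goal not met")
```

Measured this way, the goal is met once more than 70% of the tokens in the chosen test corpus are recognised.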

Implementation

A rule-based machine translation system normally requires three sets of rules: a dictionary that maps the appropriate translations between the two languages, rules representing regular Thai sentence structure, and rules representing regular Lao sentence structure. My plan is to gather datasets of Lao words and Thai words. After that, I will study the grammar of both languages and the possible sentence structures.
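
To illustrate the dictionary step, and the word cutting that Thai and Lao need because they are written without spaces, here is a minimal Python sketch. It assumes a greedy longest-match strategy over a tiny bilingual word list. The names `TOY_BIDIX`, `segment`, and `translate` are hypothetical, the three entries are toy examples, and this is not how the final Apertium pair would be implemented.

```python
# Toy Thai -> Lao word list (illustrative entries only, not project data).
TOY_BIDIX = {
    "ฉัน": "ຂ້ອຍ",   # I
    "กิน": "ກິນ",    # eat
    "ข้าว": "ເຂົ້າ",  # rice
}


def segment(text, lexicon):
    """Cut unspaced text into words by greedy longest match.

    Characters that start no known word are emitted as single-character tokens.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No dictionary word starts here: keep one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


def translate(text, bidix):
    """Translate word by word; unknown tokens are prefixed with '*'."""
    return [bidix.get(token, "*" + token) for token in segment(text, bidix)]


if __name__ == "__main__":
    sentence = "ฉันกินข้าว"  # "I eat rice", written without spaces
    print(segment(sentence, TOY_BIDIX))
    print("".join(translate(sentence, TOY_BIDIX)))
```

Greedy longest match is only one possible cutting strategy, and it can pick the wrong boundary when a long word overlaps two shorter ones. The sentence-structure rules described above would then operate on the tokens it produces.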

My Skills

I have worked on a project that used Python, Java, and JavaScript. I have also been a project manager in a Workgroup Software Management course at my university, so I have knowledge of software development planning and am confident that this project will be finished on time. I am also a native Thai speaker, so I have strong knowledge of the Thai language.