User:Sereni

From Apertium
Revision as of 10:35, 19 March 2014 by Sereni

This proposal is a draft. Any feedback will be appreciated.

Personal info

Name: Ekaterina Ageeva

Email: sereni.nm@gmail.com

IRC: Sereni


Why is it you are interested in machine translation?

I would like to work in computational linguistics, so I view machine translation as one of the possible areas. MT appeals to me because it is challenging for developers and useful for everyone. It also speaks to the idea of free and accessible information, with texts in one language instantly understandable for speakers of others. I believe MT can contribute to multicultural understanding, which is something I value.


Why is it that you are interested in the Apertium project?

Apertium's work is rooted in linguistics, a field I am interested in and have some background in. I have participated in several linguistics and programming projects, but this is an opportunity to write a complete piece of software that becomes part of a larger system and is useful to others. I chose this particular project because it matches my skills and I have a fairly good idea of what needs to be done. It is also a good place to start, since I would like to gain the skills and experience needed for more complex Apertium projects after this summer.

My professional interests in this project are to (1) practice writing quality code, (2) learn to make a finished piece of software that integrates into a larger system, (3) practice conducting linguistic experiments, (4) (possibly) write a follow-up paper.

My personal interests include (1) becoming a part of the open-source community, (2) contributing to an open-source project (Apertium in particular, because it matches my area of expertise), (3) spending the summer break on an activity that benefits more people than just me.


Proposal

I am interested in building a toolkit for gisting evaluation of Apertium language pairs. As stated in [1], the main purpose of online machine translation is gisting: users try to understand the main sense of a text rather than obtain an editable translation. It would be useful to have a tool for evaluating how well Apertium language pairs perform in such contexts. Such an evaluation would identify the pairs that are ready for release, thus increasing language coverage, and would provide a quantitative scale for measuring quality. Since the evaluation is human-based, a framework is needed that allows individual evaluations to be compared objectively. I propose to create a toolkit that, given a language pair and parallel texts in these languages, generates tests for human evaluators, checks their answers, and calculates the success rate depending on the kind of information provided to users. The system will include text-based and web-based interfaces. It will also offer different testing modes that vary the form of the questions and the amount of information provided to users.

Amount of information:

• Original sentence + reference sentence (used for baseline score)

• Original sentence, reference sentence + machine translation (for evaluation)
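
The comparison between the two conditions above could be scored roughly as follows. This is a minimal sketch, not the toolkit's actual design: the condition labels `baseline` and `with_mt`, the tuple layout of `responses`, and the case-insensitive exact-match rule are all assumptions made for illustration.

```python
from collections import defaultdict

def success_rates(responses):
    """Return the share of correctly filled gaps for each condition.

    `responses` is an iterable of (condition, given_answer, expected_answer)
    tuples. The condition labels are hypothetical.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for condition, given, expected in responses:
        total[condition] += 1
        # Assumption: answers match if equal after trimming and lowercasing.
        if given.strip().lower() == expected.strip().lower():
            correct[condition] += 1
    return {condition: correct[condition] / total[condition] for condition in total}

# Made-up responses, for illustration only.
responses = [
    ("baseline", "dog", "cat"),
    ("baseline", "house", "house"),
    ("with_mt", "cat", "cat"),
    ("with_mt", "House ", "house"),
]
rates = success_rates(responses)
print(rates)
```

The difference between the two rates would then indicate how much the machine translation helps evaluators fill the gaps.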

Types of questions:

• Simple gaps (an omitted keyword)

• Gaps with multiple choice (users are provided with words to choose from)

• Gaps with lemmas (a keyword lemma is shown, the user is required to enter the correct grammatical form)

Different types of questions will require different keyword selection techniques. For simple gaps, we determine keywords by co-occurrence (as in [2], for example) and part-of-speech tags. For multiple choice, the answer options are extracted from the same text, matched on grammatical tags and word length, as described in [3]. For lemmas, the algorithm is yet to be discussed, with the proviso that verbs rather than nouns will be removed in this case.
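
To make the three question types concrete, here is a small sketch of how a gap could be produced once a keyword has been chosen. It does not implement the co-occurrence method of [2]; the keyword, its tag, and its lemma are assumed to be supplied by an earlier step. All function names, the tag labels, and the `max_len_diff` threshold are hypothetical.

```python
import random
import re

BLANK = "_____"

def _keyword_pattern(keyword):
    # Match the keyword as a whole word only.
    return r"\b" + re.escape(keyword) + r"\b"

def make_simple_gap(sentence, keyword):
    """Blank out the first whole-word occurrence of the keyword."""
    return re.sub(_keyword_pattern(keyword), BLANK, sentence, count=1)

def make_lemma_gap(sentence, keyword, lemma):
    """Blank out the keyword and show its lemma as a hint."""
    hint = BLANK + " (" + lemma + ")"
    # A function replacement avoids backslash processing in the hint.
    return re.sub(_keyword_pattern(keyword), lambda match: hint, sentence, count=1)

def pick_distractors(keyword, keyword_tag, tagged_words, n=3, max_len_diff=2):
    """Pick up to n other words with the same tag and a similar length."""
    candidates = sorted({
        word for word, tag in tagged_words
        if tag == keyword_tag
        and word != keyword
        and abs(len(word) - len(keyword)) <= max_len_diff
    })
    return candidates[:n]

def make_multiple_choice(sentence, keyword, distractors, rng=random):
    """Return the gapped sentence and the shuffled answer options."""
    options = list(distractors) + [keyword]
    rng.shuffle(options)
    return make_simple_gap(sentence, keyword), options

# Toy data; in practice the tags would come from a morphological analyser.
sentence = "The cat sat on the mat"
tagged_words = [("cat", "n"), ("sat", "v"), ("mat", "n"),
                ("dog", "n"), ("elephant", "n"), ("ran", "v")]

distractors = pick_distractors("cat", "n", tagged_words)
gapped, options = make_multiple_choice(sentence, "cat", distractors)
print(make_simple_gap(sentence, "cat"))
print(gapped, options)
print(make_lemma_gap(sentence, "sat", "sit"))
```

Here "elephant" is rejected as a distractor because its length differs too much from that of the keyword, which is one simple way to read the length criterion.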

In order to test the toolkit, I propose to run evaluations on several language pairs, possibly the ones being developed or improved as part of a GSoC project.

Work plan

Pre-work period (1-21 April)

Familiarize myself with Apertium. Get accustomed to working in Unix to ensure a seamless workflow. Study existing work on gisting evaluation.

Community bonding period (21 April – 19 May)

Write a literature review of the articles read. Discuss keyword selection for gaps with lemmas; draft the selection methods. Learn how to integrate Apertium with Python applications. Discuss interface features with mentors.

Work period

NB: work periods include writing documentation on wiki as I go.

Week 1. Create an algorithm for keyword extraction in simple gaps. Test it on Russian and English data by comparing to results obtained using corpora and tf-idf. Write base code that creates sets of {original sentence, machine translation, reference translation with gaps, answer key} from text files.
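
For reference, the tf-idf comparison mentioned in Week 1 could look roughly like the sketch below. It is one common smoothed variant of tf-idf, not a fixed part of the plan; the function name, the whitespace tokenisation, and the toy corpus are assumptions for illustration.

```python
import math
from collections import Counter

def tfidf_keywords(document, corpus, top_n=3):
    """Rank the words of `document` by tf-idf against `corpus`.

    `document` is a list of tokens and `corpus` is a list of token lists.
    """
    term_counts = Counter(document)
    n_docs = len(corpus)
    scores = {}
    for word, count in term_counts.items():
        doc_freq = sum(1 for doc in corpus if word in doc)
        # Smoothed idf: words found in every document score zero.
        idf = math.log((1 + n_docs) / (1 + doc_freq))
        scores[word] = (count / len(document)) * idf
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda word: (-scores[word], word))[:top_n]

# Toy corpus, tokenised by whitespace for illustration.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the bird sat on the fence".split(),
]
keywords = tfidf_keywords(corpus[0], corpus, top_n=2)
print(keywords)
```

Words shared by all three sentences ("the", "sat", "on") receive a score of zero, so only the sentence-specific words remain as keyword candidates.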

Week 2. Create a method to determine significant grammatical features in different languages for multiple choice gaps. Create an algorithm that selects words for multiple choice gaps. Test it on Russian and English. Possibly find a speaker of a non-Indo-European language for testing.

Week 3. Design rules for gaps with lemmas. Update code to create multiple choice gaps and gaps with lemmas.

Week 4. Write test cases and module tests. Test the toolkit.

Deliverable 1: A program in Python that creates three varieties of tests given parallel texts and a language pair.

Week 5. Develop the text-based interface: create text files with tasks, extract answers from returned text files.
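
One possible shape for the text-based round trip is sketched below: tasks are rendered as plain text, the evaluator fills in the "Answer:" lines, and the answers are read back from the returned text. The file layout, the field names (`original`, `gapped_reference`, `machine_translation`), and the function names are hypothetical and would be settled during the project.

```python
def render_tasks(items):
    """Render test items as plain text for evaluators to fill in."""
    lines = []
    for number, item in enumerate(items, start=1):
        lines.append(f"# Task {number}")
        lines.append(f"Original: {item['original']}")
        # The machine translation is shown only in the evaluation condition.
        if "machine_translation" in item:
            lines.append(f"Machine translation: {item['machine_translation']}")
        lines.append(f"Reference with gap: {item['gapped_reference']}")
        lines.append("Answer: ")
        lines.append("")
    return "\n".join(lines)

def parse_answers(text):
    """Collect whatever follows each 'Answer:' marker, in task order."""
    answers = []
    for line in text.splitlines():
        if line.startswith("Answer:"):
            answers.append(line[len("Answer:"):].strip())
    return answers

# Toy items, for illustration only.
items = [
    {"original": "Кошка сидит на коврике.",
     "machine_translation": "Cat sits on rug.",
     "gapped_reference": "The _____ is sitting on the mat."},
    {"original": "Собака бежит.",
     "gapped_reference": "The _____ is running."},
]
tasks = render_tasks(items)

# Simulate an evaluator answering the first task and skipping the second.
returned = tasks.replace("Answer: ", "Answer: cat", 1)
answers = parse_answers(returned)
print(answers)
```

Skipped tasks come back as empty strings, so they can be counted separately from wrong answers if that distinction turns out to matter.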

Week 6. Get familiar with command line interface creation. Develop the command line interface to wrap text generation.

Weeks 7-8. Develop the web-based interface. It will include a landing page for evaluators with choice of language pair and testing method (this can also be randomly assigned), admin for managing the database and a pretty (public?) stats page. Host it on the web.

Week 9. Test the interfaces for technical errors. Fix bugs. Collect feedback on usability and improve it.

Deliverable 2: Text-based interface with command line wrapper and web-based interface for the toolkit.

Week 10. User acceptance testing: perform gisting evaluation of 2-3 existing language pairs. Find texts and informants. Ensure the balance of testing methods.

Week 11. User acceptance testing: perform gisting evaluation of 2-3 existing language pairs (continued). Analyze and summarize the results.

Week 12. Resolve issues found during testing. Refine the code and wrap up.

End product: Gisting evaluation toolkit with text and web interfaces.

References

[1] Jim O'Regan, Mikel L. Forcada: Peeking through the language barrier: the development of a free/open-source gisting system for Basque to English based on apertium.org. Procesamiento del Lenguaje Natural 51: 15-22 (2013)

[2] Yutaka Matsuo, Mitsuru Ishizuka: Keyword extraction from a single document using word co-occurrence statistical information. International Journal on Artificial Intelligence Tools 13(1): 157-169 (2004)

[3] Trond Trosterud, Kevin Brubeck Unhammer: Evaluating North Sámi to Norwegian assimilation RBMT. Proceedings of the Third International Workshop on Free/Open-Source Rule-Based Machine Translation (FreeRBMT 2012) (2012)