Ideas for Google Summer of Code/Apertium assimilation evaluation toolkit

This was a 2014 project; see Assimilation_Evaluation_Toolkit.

Many Apertium language pairs are designed for assimilation (gisting) purposes. The evaluation described here would measure how helpful they are for that task.

Starting from files containing sentences in the source language and reference translations, generate tests for human evaluation consisting of:

  1. (optionally) the source sentence,
  2. (optionally) the machine-translated version of the source sentence, and
  3. a reference translation of the sentence in which one or more content words have been deleted.
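As an illustration only, the test generation described above could be sketched as follows. This is not the project's implementation: the function names (`make_cloze`, `make_test_item`), the gap marker, and above all the way content words are detected (a small English function-word list standing in for real part-of-speech information) are assumptions made for the example.

```python
import random
import re

# Assumption: a tiny English function-word list stands in for real
# part-of-speech information; a real toolkit could use a tagger instead.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "by", "from", "is", "are", "was", "were", "be", "been",
    "it", "its", "this", "that", "he", "she", "they", "we", "you", "i",
    "not", "as", "has", "have", "had",
}

GAP = "_____"  # hypothetical gap marker shown to the subject


def make_cloze(reference, n_gaps=1, seed=0):
    """Return (gapped_sentence, answers) for one reference translation."""
    tokens = reference.split()
    # Indices of tokens treated as content words under the heuristic above.
    candidates = []
    for i, tok in enumerate(tokens):
        bare = re.sub(r"\W", "", tok).lower()
        if bare and bare not in FUNCTION_WORDS:
            candidates.append(i)
    rng = random.Random(seed)  # seeded so that a test set is reproducible
    chosen = sorted(rng.sample(candidates, min(n_gaps, len(candidates))))
    answers = []
    for i in chosen:
        # Keep surrounding punctuation in place; only the word becomes a gap.
        word = re.sub(r"^\W+|\W+$", "", tokens[i])
        answers.append(word)
        tokens[i] = tokens[i].replace(word, GAP, 1)
    return " ".join(tokens), answers


def make_test_item(source, mt, reference, show_source=False, show_mt=False, seed=0):
    """Build one test item; source and MT are included only when requested."""
    gapped, answers = make_cloze(reference, n_gaps=1, seed=seed)
    item = {"gapped_reference": gapped, "answers": answers}
    if show_source:
        item["source"] = source
    if show_mt:
        item["mt"] = mt
    return item
```

Calling `make_test_item` with different combinations of `show_source` and `show_mt` yields the different presentation conditions for the same sentence.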

The idea is to measure how much the ability of human subjects to fill in the gaps improves when the source sentence or a machine translation of it is presented. The task also involves writing a program that computes the success rate as a function of the information presented to the subject, and utilities that automate the whole process for a given Apertium language pair.
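A minimal sketch of such a scoring program is given below, under stated assumptions: the condition labels (`"reference_only"`, `"mt"`), the `(condition, expected, given)` tuple layout, and the use of case-insensitive exact matching are all choices made for this example. A real evaluation might also want to credit synonyms, which this sketch does not.

```python
from collections import defaultdict


def success_by_condition(responses):
    """Return the fraction of correctly filled gaps for each condition.

    responses: iterable of (condition, expected, given) tuples, where
    expected and given are lists of words for the gaps of one sentence.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for condition, expected, given in responses:
        for i, want in enumerate(expected):
            # A gap the subject left unanswered counts as wrong.
            got = given[i] if i < len(given) else ""
            total[condition] += 1
            if got.strip().lower() == want.strip().lower():
                correct[condition] += 1
    return {c: correct[c] / total[c] for c in total}


# Hypothetical responses from two presentation conditions.
responses = [
    ("reference_only", ["cat", "mat"], ["cat", "rug"]),
    ("mt", ["cat", "mat"], ["Cat", "mat"]),
]
rates = success_by_condition(responses)
# Improvement attributable to showing the machine translation.
gain = rates["mt"] - rates["reference_only"]
```

Comparing the rate for each condition against the reference-only baseline gives the improvement that the source or the machine translation provides.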

There should be both a text-based interface and a web-based interface.

Tasks

Coding challenge

  • Install Apertium
  • Perform an assimilation evaluation of a language pair of your choice with a text no shorter than 300 words.
    • The more distant the two languages are, the better

Frequently asked questions

  • none yet, ask us something! :)

See also