Ideas for Google Summer of Code/Make a language pair state-of-the-art

Revision as of 08:38, 20 March 2014

Take a released language pair and drastically improve its performance, both in coverage and in translation quality. This will involve working with dictionaries, transfer rules, scripting and corpora. The objective is to make an Apertium language pair state-of-the-art, or close to it, in terms of translation quality. In practice this means improving coverage to 95-98% on a range of corpora and decreasing the word error rate by 30-50% in relative terms. For example, if the current word error rate is 30%, it should be reduced to 15-21%.
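The two targets above are simple arithmetic. As a rough sketch only: the function names below are made up for illustration and are not part of Apertium, and the coverage estimate assumes that unknown words appear in the translator's output prefixed with an asterisk.

```python
def target_wer_range(current_wer, min_reduction=0.30, max_reduction=0.50):
    """Return (best, worst) target WER after a relative reduction of 30-50%."""
    best = current_wer * (1 - max_reduction)
    worst = current_wer * (1 - min_reduction)
    return best, worst


def naive_coverage(translated_text):
    """Estimate coverage as the share of tokens not marked as unknown.

    Assumption: unknown words are prefixed with '*' in the output.
    """
    tokens = translated_text.split()
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token.startswith("*"))
    return 1 - unknown / len(tokens)
```

Calling `target_wer_range(30.0)` gives the range a 30% word error rate would need to fall into. Whitespace tokenisation is a simplification; a real measurement would use the language pair's own analyser.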

Coding challenge

  • Find a language pair of your choice.
  • Translate 2,000 words of text (e.g. four articles of 500 words each).
  • Post-edit the text to make a reference translation.
  • Use two articles to improve the translator.
    • Add all the words, and cover all the structures with transfer rules.
  • Calculate the improvement that you were able to make on these two articles, and on your two held-out articles.
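The last step can be sketched as follows, assuming word error rate is taken as the word-level edit distance between the post-edited reference and the machine translation, divided by the length of the reference. This is a minimal illustration with hypothetical function names, not an official Apertium evaluation tool.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def relative_improvement(wer_before, wer_after):
    """Relative WER reduction, e.g. 0.5 means the error rate was halved."""
    if wer_before == 0:
        raise ValueError("wer_before must be greater than zero")
    return (wer_before - wer_after) / wer_before
```

Run it once on the translator's output before your changes and once after, separately for the two development articles and the two held-out articles; comparing the two relative improvements shows how well the work generalises.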


Frequently asked questions

  • none yet, ask us something! :)

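See also

  • An example work plan for a language pair: http://wiki.apertium.org/wiki/Maltese_and_Arabic/Work_plan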