Task ideas for Google Code-in/Language detection in simple-html and apertium-apy



Language detection in the simple-html interface currently relies on a 2.9 MB JavaScript file. The objective of this set of tasks is to make it use apertium-apy for language detection instead.


Implement language detection in apertium-apy

Currently apertium-apy does not do language detection. Add a function that identifies the language of some input text and returns a dict mapping languages to probabilities. For this task you will also need to train models for the language identifier.
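
As an illustration of the expected return value, here is a minimal sketch of a character-trigram identifier. It is not the apertium-apy implementation: the function names (train, identify_language), the add-one smoothing and the tiny training samples are all assumptions made for this example, and a real solution would train on much larger corpora or use an existing language identification library.

```python
import math
from collections import Counter


def trigrams(text):
    """Return the character trigrams of text, padded with spaces."""
    padded = "  " + text.lower() + "  "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(samples):
    """Build one trigram frequency table per language code (hypothetical helper)."""
    return {lang: Counter(trigrams(text)) for lang, text in samples.items()}


def identify_language(text, models):
    """Return a dict mapping language codes to probabilities that sum to 1."""
    vocab = set()
    for counts in models.values():
        vocab.update(counts)
    scores = {}
    for lang, counts in models.items():
        # Add-one smoothing so unseen trigrams do not zero out a language.
        total = sum(counts.values()) + len(vocab) + 1
        scores[lang] = sum(
            math.log((counts[gram] + 1) / total) for gram in trigrams(text)
        )
    # Softmax over the log-likelihoods, shifted by the maximum for stability.
    top = max(scores.values())
    weights = {lang: math.exp(score - top) for lang, score in scores.items()}
    norm = sum(weights.values())
    return {lang: weight / norm for lang, weight in weights.items()}
```

Calling identify_language("the dog and the fox", models) with models trained on an English and a Spanish sample returns a dict such as {"eng": ..., "spa": ...} whose values sum to 1.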

Localised 'available languages' in apertium-apy

Make a new function for apertium-apy that takes a language code as input and returns the list of available pairs, along with the names of their languages translated into the language specified by that code. You will probably need to know JavaScript and Python.
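
One possible shape for this function is sketched below. The PAIRS and NAMES tables, the function name localised_pairs and the keys of the returned dicts are hypothetical; in apertium-apy the pairs would come from the installed language pairs, and the names from whatever localisation data is chosen for the task.

```python
# Hypothetical data for illustration only.
PAIRS = [("eng", "spa"), ("spa", "cat"), ("eng", "cat")]
NAMES = {
    "eng": {"eng": "English", "spa": "Spanish", "cat": "Catalan"},
    "spa": {"eng": "inglés", "spa": "español", "cat": "catalán"},
}


def localised_pairs(locale, pairs=PAIRS, names=NAMES):
    """List the available pairs with language names translated into locale.

    If no translation is known, the language code itself is used as the name.
    """
    table = names.get(locale, {})
    result = []
    for source, target in pairs:
        result.append({
            "sourceLanguage": source,
            "targetLanguage": target,
            "sourceName": table.get(source, source),
            "targetName": table.get(target, target),
        })
    return result
```

Falling back to the bare language code keeps the menu usable when a locale has no translations yet; whether that is the right behaviour is a design decision for the task.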

Use external language detection in simple-html

Make the simple-html interface stop using the 2.9 MB JavaScript module for language detection/identification. Instead, it should query apertium-apy with the text to get a list of languages with probabilities.
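
The request and response handling could look roughly like the sketch below. It is written in Python to match the apertium-apy side; the interface itself would do the equivalent in JavaScript. The endpoint path /identifyLang, the query parameter q, the base URL in the test, and the JSON response shape (an object mapping language codes to probabilities) are assumptions, since this page does not define them.

```python
import json
from urllib.parse import urlencode


def build_detection_url(base_url, text):
    """Build the query URL for a hypothetical /identifyLang endpoint."""
    return base_url.rstrip("/") + "/identifyLang?" + urlencode({"q": text})


def parse_detection_response(body):
    """Parse an assumed JSON body of the form {"eng": 0.9, "spa": 0.1}."""
    data = json.loads(body)
    return {lang: float(probability) for lang, probability in data.items()}
```

Keeping URL construction and response parsing separate from the network call makes both easy to test without a running server.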

Interface behaviour for language guessing

Based on the results of the language detection function, make simple-html highlight the n (e.g., 3) most probable languages in the menu, and select the most probable one.
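
The selection logic is small enough to sketch directly. The version below is in Python for consistency with the apertium-apy side, whereas simple-html would implement it in JavaScript; the function name rank_languages and its return shape are assumptions.

```python
def rank_languages(probabilities, n=3):
    """Return (codes to highlight, code to select) from a language->probability dict."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    highlighted = [lang for lang, _ in ranked[:n]]
    # Select the most probable language, or nothing if detection returned no results.
    selected = highlighted[0] if highlighted else None
    return highlighted, selected
```

For example, given {"eng": 0.7, "spa": 0.2, "cat": 0.06, "fra": 0.04} and n = 3, the languages to highlight are eng, spa and cat, and eng is selected.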