User:Mono/GSoC 2017

Revision as of 18:18, 27 August 2017 by Mono

Apertium is a free/open-source platform for rule-based machine translation and language technology, aimed at providing support for lesser-resourced and marginalized languages. The current Apertium interface is already quite capable. However, adding features such as webpage translation, a spellchecker interface and dictionary lookup would make the platform more useful still. My GSoC project revolved around implementing these features and making the interface more robust.

I would like to thank my mentors Sushain, Jonathan, Xavivars, Unhammer and TinoDidriksen, and the entire Apertium community, for helping and guiding me throughout this project. None of what was accomplished would have been possible without the support of my mentors. I would also like to thank the Google Summer of Code community for providing a platform where I could learn, build my skill set and contribute to open source.

The rest of this page describes the work I accomplished during GSoC 2017.

Webpage Translation mode

An interface that lets the user enter a URL, choose a source and a destination language, and translate the webpage.

Code

Backend

Frontend

Documentation

Backend

URL: /translatePage
Function: Translates a webpage
Parameters:
  • langpair: language pair to use for translation
  • url: URL of the webpage to be translated
Output: Returns the translated webpage
Example:
curl -Ss 'http://localhost:2737/translatePage?langpair=eng|spa&url=http://facebook.com'

output
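
When the endpoint is called from a script instead of curl, it is safer to percent-encode the query string, because the `|` in the language pair and the nested URL contain reserved characters. The following is a minimal Python sketch, assuming APy listens on localhost:2737 as in the example; the helper name `build_apy_url` is hypothetical and not part of APy.

```python
from urllib.parse import urlencode

# Assumption: a local APy instance, as in the curl example.
APY_ROOT = "http://localhost:2737"


def build_apy_url(endpoint, **params):
    """Return the request URL for an APy endpoint with percent-encoded parameters."""
    return f"{APY_ROOT}/{endpoint.lstrip('/')}?{urlencode(params)}"


url = build_apy_url("translatePage", langpair="eng|spa", url="http://facebook.com")
print(url)
```

The same helper can be reused for the other endpoints described on this page by changing the endpoint name and parameters.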

Frontend

ENABLED_MODES: an array of the enabled interfaces, a non-empty subset of ['translation', 'analyzation', 'generation', 'sandbox']

  • translation lookup turns on webpage translation mode.

Future Work

1. The backend for this mode was merged in this commit. A screenshot of the current state of the interface can be found here.
2. The interface is largely functional. A remaining task is to use a form handler when submitting URLs for translation. The related issues are discussed in this comment.


SpellChecker mode

Checks the spelling of input text in a given language and suggests alternatives for misspelled words.

Code

Backend

Frontend

Documentation

Backend

URL: /speller
Function: Performs spellchecking on a given text for a given language
Parameters:
  • lang: language to perform spellchecking for
  • q: text to perform spellchecking on
Output: Returns the spellchecking results
Example:
curl -Ss 'http://localhost:2737/speller?lang=hin&q=माय' | ascii2uni -a U -q

[{"sugg": [["काय", "1.000000"], ["चाय", "1.000000"], ["राय", "1.000000"], ["हाय", "1.000000"], ["साय", "1.000000"], ["मा", "1.000000"], ["वाय", "1.000000"], ["दाय", "1.000000"], ["गाय", "1.000000"], ["जाय", "1.000000"]], "known": false, "token": "माय"}]
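
The response above is a JSON list with one entry per token. As a rough illustration of how a client might consume it, the Python sketch below collects the suggestions for every token marked as unknown. The sample is the response above shortened to three suggestions, and the function name `unknown_tokens` is hypothetical.

```python
import json

# Sample taken from the /speller example above, shortened to three suggestions.
SAMPLE = (
    '[{"sugg": [["काय", "1.000000"], ["चाय", "1.000000"], ["राय", "1.000000"]], '
    '"known": false, "token": "माय"}]'
)


def unknown_tokens(response_text):
    """Map each token the speller did not recognise to its suggested words."""
    result = {}
    for entry in json.loads(response_text):
        if not entry["known"]:
            # Each suggestion is a [word, weight] pair; keep only the word.
            result[entry["token"]] = [word for word, _weight in entry["sugg"]]
    return result


suggestions = unknown_tokens(SAMPLE)
print(suggestions)
```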

Frontend

ENABLED_MODES: an array of the enabled interfaces, a non-empty subset of ['translation', 'analyzation', 'generation', 'sandbox', 'speller']

  • speller turns on spell checking mode.

Future Work

1. A screenshot of the current state of the interface can be found here.
2. The logic that maps the suggestions returned by the backend for each token to the corresponding text on the frontend needs to be improved.


Dictionary Lookup mode

An interface that generates the dictionary forms of a given word. It renders the translations of the word for a given language pair.

Code

Backend

Frontend

Documentation

Backend

URL: /dictionaryLookup
Function: Generates the dictionary forms of a given word
Parameters:
  • langpair: language pair to use for translation
  • q: word to perform dictionary lookup on
Output: Returns the possible forms of the word after translation
Example:
curl -Ss 'http://localhost:2737/dictionaryLookup?langpair=eng|spa&q=light'
{"vblex": ["encender", "iluminar"], "n": ["luz"], "adj": ["ligero", "claro"]}
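
The response groups the translations by part-of-speech tag. The Python sketch below shows one possible way to turn it into display lines, using the sample response above. The `POS_LABELS` mapping and the function name `format_entries` are assumptions made for illustration; they are not taken from the interface code.

```python
import json

# Sample response from the /dictionaryLookup example above.
SAMPLE = '{"vblex": ["encender", "iluminar"], "n": ["luz"], "adj": ["ligero", "claro"]}'

# Assumption: readable labels for the Apertium tags that occur in the sample.
POS_LABELS = {"vblex": "verb", "n": "noun", "adj": "adjective"}


def format_entries(response_text):
    """Return one 'label: word, word' line per part-of-speech tag."""
    entries = json.loads(response_text)
    return [
        # Fall back to the raw tag when no label is known.
        f"{POS_LABELS.get(tag, tag)}: {', '.join(words)}"
        for tag, words in entries.items()
    ]


lines = format_entries(SAMPLE)
print("\n".join(lines))
```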

Frontend

ENABLED_MODES: an array of the enabled interfaces, a non-empty subset of ['translation', 'analyzation', 'generation', 'sandbox']

  • translation lookup turns on dictionary lookup mode.

Future Work

1. A screenshot of the current state of the interface can be found here.
2. The pending tasks for dictionary lookup mode are discussed in this comment.


Installation Notification

1. A notification that appears when a request to APy takes longer than a threshold time.
2. The notification also appears when the cumulative average of the request durations exceeds a certain threshold. This indicates that the servers may be overloaded at that time, so the user could set up APy locally instead.
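
The two conditions above can be sketched as follows. This is a Python illustration of the logic only; the actual check lives in the JavaScript frontend, and the class name, method name and threshold values are hypothetical.

```python
class RequestMonitor:
    """Track request durations and decide when the notification should appear."""

    def __init__(self, single_threshold_ms, average_threshold_ms):
        self.single_threshold_ms = single_threshold_ms
        self.average_threshold_ms = average_threshold_ms
        self.count = 0
        self.average_ms = 0.0

    def record(self, duration_ms):
        """Record one request; return True if the notification should be shown."""
        self.count += 1
        # Incremental update of the cumulative average over all requests so far.
        self.average_ms += (duration_ms - self.average_ms) / self.count
        return (
            duration_ms > self.single_threshold_ms
            or self.average_ms > self.average_threshold_ms
        )


# Assumption: illustrative thresholds, not the values used by the interface.
monitor = RequestMonitor(single_threshold_ms=4000, average_threshold_ms=2000)
results = [monitor.record(duration) for duration in (500, 1500, 5000, 3000)]
print(results)
```

In this run the third request trips the single-request threshold, and the fourth trips the average threshold even though it is itself below the single-request limit.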

Code

1. One issue observed here: if an AJAX request was made that did not go through callApy(), it picked up the apyRequestStartTime value of the previously executed callApy() call and ran handleApyRequestCompletion() in ajaxComplete(). Because the resulting difference exceeded the threshold, the installation notification was always shown.
2. The following patch resolved the above issue.
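
One way to guard against a stale start time, sketched in Python, is to clear the stored value once a request completes and to ignore completions that have no start time. This illustrates the idea only and is not necessarily what the patch does; `call_apy` and `on_request_complete` are hypothetical stand-ins for callApy() and the ajaxComplete() handler.

```python
class ApyTimer:
    """Measure elapsed time only for requests that were started via call_apy."""

    def __init__(self):
        # Set only by call_apy; None means no APy request is being timed.
        self.start_time = None

    def call_apy(self, now):
        """Remember when an APy request was started."""
        self.start_time = now

    def on_request_complete(self, now):
        """Return the elapsed time for an APy request, or None for any other request."""
        if self.start_time is None:
            return None
        elapsed = now - self.start_time
        # Clear the value so a later non-APy request cannot reuse it.
        self.start_time = None
        return elapsed


timer = ApyTimer()
timer.call_apy(now=0.0)
first = timer.on_request_complete(now=0.2)    # the APy request itself
second = timer.on_request_complete(now=30.0)  # an unrelated AJAX request
print(first, second)
```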

Code


POST vs. GET

1. Initially, the AJAX requests used the GET method to retrieve data from the backend.
2. GET was used together with JSONP to allow cross-domain requests. However, large inputs produced a 414 (Request-URI Too Large) error, so those requests failed.
3. This issue was resolved by using POST when the request size exceeds a threshold, and GET otherwise.
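
The selection rule in the last step can be sketched in Python as below. The threshold value and the function name `choose_method` are assumptions for illustration; the interface itself implements this in JavaScript and may measure the request size differently.

```python
from urllib.parse import urlencode

# Assumption: an illustrative limit, not the value used by the interface.
URL_LENGTH_THRESHOLD = 2000


def choose_method(base_url, params, threshold=URL_LENGTH_THRESHOLD):
    """Return 'GET' if the encoded URL fits within the threshold, else 'POST'."""
    full_url = f"{base_url}?{urlencode(params)}"
    return "GET" if len(full_url) <= threshold else "POST"


endpoint = "http://localhost:2737/translate"
short_method = choose_method(endpoint, {"langpair": "eng|spa", "q": "hello"})
long_method = choose_method(endpoint, {"langpair": "eng|spa", "q": "word " * 1000})
print(short_method, long_method)
```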

Code


Language Dropdown going offscreen Issue

1. The language dropdown used to go offscreen when the browser window was resized. This prevented users from choosing the language they wanted.
2. This issue was fixed by dynamically determining the available space in the browser window (triggered on resize) and adjusting the number of columns so that the languages fit inside the viewport.
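
The column adjustment amounts to a small calculation, sketched here in Python. The function name and the pixel values are hypothetical; the real code works on the dropdown's DOM elements in JavaScript.

```python
def columns_that_fit(available_width, column_width, max_columns):
    """Return how many dropdown columns fit in the available width (at least one)."""
    fitting = available_width // column_width
    # Never exceed the configured maximum and never drop below one column.
    return max(1, min(fitting, max_columns))


# Assumption: illustrative widths in pixels.
wide = columns_that_fit(available_width=1200, column_width=200, max_columns=4)
medium = columns_that_fit(available_width=500, column_width=200, max_columns=4)
narrow = columns_that_fit(available_width=150, column_width=200, max_columns=4)
print(wide, medium, narrow)
```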

Code


LTR/RTL alignment of languages in dropdown

1. Despite setting the left-to-right or right-to-left orientation for the language display names, the browser did not render them as expected.
2. A patch applied the necessary styling to the display names, and to the associated UI elements, to achieve the correct rendering.

Code


Interface breaks when cookies disabled

1. The Apertium interface used to break when cookies were disabled.
2. This happened because the interface interacts with the browser's localStorage, and the browser prohibits this interaction when cookies are disabled. The code did not handle this case.
3. The issue was resolved by handling the exception that occurs when cookies are disabled.

Code


Improve detectLanguage() functionality

1. Previously, this method did not call the autoDstSelectLang() method to select a destination language automatically after the language of a given text was identified.

Code


Prevent the requests when input is empty

1. The handlers on the backend gave a server error when requests were made with empty inputs or when any of the necessary arguments was missing.
2. Validation was added for several features, including the Analyzer, Generator, Detect Language and APy Sandbox.
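
A check of this kind can be sketched in Python as follows. The function name `missing_params` is hypothetical, and the sketch shows only the idea of refusing to send a request whose required inputs are absent or blank.

```python
def missing_params(required, params):
    """Return the required parameter names that are absent or blank in params."""
    missing = []
    for name in required:
        value = params.get(name)
        # Treat None, empty strings and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


blank_input = missing_params(["lang", "q"], {"lang": "hin", "q": "   "})
no_langpair = missing_params(["langpair", "q"], {"q": "light"})
complete = missing_params(["langpair", "q"], {"langpair": "eng|spa", "q": "light"})
print(blank_input, no_langpair, complete)
```

A request would be sent only when the returned list is empty.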

Code


Improvement of Functionalities

1. The swap button did not swap the source and destination languages on smaller screens.
2. The translate button did not call the translate() method on smaller screens.
3. The Detect Language button was active on the docTranslation interface, although the detection it performed applied to the input text of the translateText interface.
4. The appropriate translate() method is now called based on the interface from which it is invoked.
5. Container animation issues were fixed. When the interface was switched between containers rapidly, the animation used to break and render a blank screen.
6. The language selectors used to overlap with the swap button for a certain set of recent source languages.
7. A button was added that takes the user to the top of the webpage.
8. APY -> APy stylization.
9. The Translate, Analyze and Request buttons were aligned with their respective textareas on the interface.
10. The translate() method is now executed as soon as either the source or the destination language is changed (so that it also executes on the docTranslation interface).

The above issues were resolved through the following patches:


Code


Important Links

The following are the important links for this project:
1. Apertium Wiki

2. Apertium Web Interface

3. Apertium html-tools github

4. Apertium APy github

5. Apertium html-tools forked repo github

6. Apertium APy forked repo github

7. Commits to master (pull requests that got merged):
Frontend:

Backend:

8. Issues opened by me:
Frontend:

Backend:

9. Pull requests by me:
Frontend:

Backend: