User:Mono/GSoC 2017
Apertium is a free/open-source platform for rule-based machine translation and language technology, aimed at providing support for lesser-resourced and marginalized languages. The current Apertium interface is already pretty awesome. However, adding a few more functionalities, such as webpage translation, a spellchecker interface and a dictionary lookup feature, would make the platform even more awesome. My GSoC project has revolved around implementing these features and making the interface more robust.
I would like to thank my mentors Sushain, Jonathan, Xavivars, Unhammer, TinoDidriksen and the entire Apertium community for helping and guiding me throughout the course of this project. None of what was accomplished would have been remotely possible without the support of my mentors. I would also like to thank the Google Summer of Code community for providing me with a platform where I could learn, build my skillset and quench my thirst for open source contribution.
The rest of this page describes the work I accomplished during GSoC 2017.
Contents
- 1 Webpage Translation mode
- 2 SpellChecker mode
- 3 Dictionary Lookup mode
- 4 Suggestions Interface
- 5 Installation Notification
- 6 POST v/s GET
- 7 Language Dropdown going offscreen Issue
- 8 LTR/RTL alignment of languages in dropdown
- 9 Interface breaks when cookies are disabled issue
- 10 Improve detectLanguage() functionality
- 11 Prevent the requests when input is empty
- 12 Improvement of Functionalities
- 13 Miscellaneous Issues
- 14 Important Links
Webpage Translation mode
An interface that lets the user input a URL, choose a source language and a destination language, and translate the webpage. This feature has been successfully completed as a part of my GSoC project! Both the frontend and the backend for this feature have been merged into the main project.
Code
Backend
Frontend
- https://github.com/apertium/apertium-html-tools/pull/154
- https://github.com/apertium/apertium-html-tools/pull/202
- https://github.com/apertium/apertium-html-tools/pull/205
- https://github.com/apertium/apertium-html-tools/pull/208 (The merged PR)
Documentation
Backend
URL | Function | Parameters | Output |
---|---|---|---|
/translatePage | Translates a webpage | | Returns the translated webpage |
Example:
curl -Ss 'http://localhost:2737/translatePage?langpair=eng|spa&url=http://facebook.com'
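When calling this endpoint from a script rather than from curl, the `|` in the language pair and the nested URL should be percent-encoded. The following is a minimal sketch in Python, assuming the host, port and parameter names from the curl example above; the helper name translate_page_url is hypothetical and not part of APy.

```python
from urllib.parse import urlencode

# Host and port are taken from the curl example; adjust for your own APy instance.
APY_URL = "http://localhost:2737"


def translate_page_url(langpair: str, url: str, base: str = APY_URL) -> str:
    """Build a /translatePage request URL with percent-encoded parameters.

    Hypothetical helper for illustration only.
    """
    query = urlencode({"langpair": langpair, "url": url})
    return f"{base}/translatePage?{query}"


print(translate_page_url("eng|spa", "http://facebook.com"))
# http://localhost:2737/translatePage?langpair=eng%7Cspa&url=http%3A%2F%2Ffacebook.com
```

The resulting string can be passed to any HTTP client; the sketch itself makes no network request.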
Frontend
ENABLED_MODES: an array of the enabled interfaces, a non-empty subset of ['translation', 'analyzation', 'generation', 'sandbox']
translation lookup
turns on webpage translation mode.
The backend for this mode was merged through this commit. The frontend for this mode was merged through this PR. A screenshot of the current state of the interface can be found here.
Future Work
Make use of a form handler while submitting the URL links for translation. The related issues are in this comment.
SpellChecker mode
Checks the spelling of the input text for a given language and suggests alternatives for misspelled words.
Code
Backend
Frontend
Documentation
Backend
URL | Function | Parameters | Output |
---|---|---|---|
/speller | Performs spellchecking on a given text for a given language | | Returns the spellchecking results |
Example:
curl -Ss 'http://localhost:2737/speller?lang=hin&q=माय' | ascii2uni -a U -q
[{"sugg": [["काय", "1.000000"], ["चाय", "1.000000"], ["राय", "1.000000"], ["हाय", "1.000000"], ["साय", "1.000000"], ["मा", "1.000000"], ["वाय", "1.000000"], ["दाय", "1.000000"], ["गाय", "1.000000"], ["जाय", "1.000000"]], "known": false, "token": "माय"}]
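To show how a client might consume this output, here is a small sketch in Python that collects the suggestions for every token the speller does not know. It assumes only the response shape shown in the example above (a list of objects with "sugg", "known" and "token" keys); the function name and the shortened sample response are illustrative.

```python
import json

# Shortened copy of the sample response above (first two suggestions only).
SAMPLE_RESPONSE = (
    '[{"sugg": [["काय", "1.000000"], ["चाय", "1.000000"]], '
    '"known": false, "token": "माय"}]'
)


def suggestions_for_unknown(response_text: str) -> dict:
    """Map each unknown token to its list of suggested words.

    Hypothetical helper; the weights returned by the backend are ignored here.
    """
    result = {}
    for entry in json.loads(response_text):
        if not entry["known"]:
            result[entry["token"]] = [word for word, _weight in entry["sugg"]]
    return result


print(suggestions_for_unknown(SAMPLE_RESPONSE))
```

Tokens marked as known are skipped, so a correctly spelled input yields an empty mapping.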
Frontend
ENABLED_MODES: an array of the enabled interfaces, a non-empty subset of ['translation', 'analyzation', 'generation', 'sandbox', 'speller']
speller
turns on spell checking mode.
A screenshot of the current state of the interface can be found here.
Future Work
Improve the logic that maps the suggestions returned by the backend for each token to the corresponding text on the frontend.
Dictionary Lookup mode
An interface that generates all forms of a given word. It renders the definitions of a given word for a given language pair after translating them.
Code
Backend
Frontend
Documentation
Backend
URL | Function | Parameters | Output |
---|---|---|---|
/dictionaryLookup | Generates dictionary forms of a given word | | Returns the possible forms of the word after translation |
Example:
curl -Ss 'http://localhost:2737/dictionaryLookup?langpair=eng|spa&q=light'
{"vblex": ["encender", "iluminar"], "n": ["luz"], "adj": ["ligero", "claro"]}
Frontend
ENABLED_MODES: an array of the enabled interfaces, a non-empty subset of ['translation', 'analyzation', 'generation', 'sandbox']
translation lookup
turns on dictionary lookup mode.
A screenshot of the current state of the interface can be found here.
Future Work
The pending tasks with respect to dictionary lookup mode are discussed in this comment.
Suggestions Interface
An interface that lets the user insert suggestions on the wiki page.
Code
Frontend
Backend
Future Work
Work on this feature had only just begun. The focus was first put on completing the three features above before progressing on this one. Thus, there is no documentation for this feature as a part of my project. Future tasks for this feature would involve enhancing both the frontend and the backend code, testing the functionality and then creating a pull request for it.
Installation Notification
1. A notification that appears when a request made to APy takes longer than a threshold time.
2. The notification also appears when the average duration of recent requests exceeds a certain threshold. This indicates that the servers may be overloaded at that time, so the user may want to set up APy locally.
3. At any point, we maintain a queue of request durations with a certain maximum size. When the queue is full, we dequeue the oldest duration and enqueue the duration of the latest request. This yields a moving average and helps determine whether the load on the server has reduced.
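The three points above can be sketched as follows. This is an illustration in Python rather than the actual JavaScript from the interface; the class name, the parameter names and the threshold values are all assumptions made for the example.

```python
from collections import deque


class RequestDurationMonitor:
    """Tracks recent request durations and decides when to notify.

    Hypothetical sketch: thresholds and queue size are illustrative values,
    not the ones used by the Apertium interface.
    """

    def __init__(self, single_threshold: float, average_threshold: float, max_samples: int):
        self.single_threshold = single_threshold
        self.average_threshold = average_threshold
        # A deque with maxlen drops the oldest entry automatically when full,
        # which gives the moving window described in point 3.
        self.durations = deque(maxlen=max_samples)

    def record(self, duration: float) -> bool:
        """Store a duration; return True if the notification should be shown."""
        self.durations.append(duration)
        average = sum(self.durations) / len(self.durations)
        return duration > self.single_threshold or average > self.average_threshold


monitor = RequestDurationMonitor(single_threshold=4.0, average_threshold=2.0, max_samples=3)
for seconds in [1.0, 5.0, 1.0, 1.0, 1.0]:
    print(seconds, monitor.record(seconds))
```

In this run the slow 5.0 second request triggers the notification directly, and it keeps the moving average above the threshold until it leaves the window of three samples.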
Code
1. One issue was observed here: the variable apyRequestStartTime stored the timestamp at which an AJAX request was made through the callApy() method, but it was not cleared after the request completed. As a result, when an AJAX request that did not go through callApy() completed, it reused the start timestamp of the previous request. The difference between its completion time and that stale timestamp almost always exceeded the threshold, so the notification was displayed erroneously.
The following patch resolved the above issue.
Code
POST v/s GET
1. Initially, the AJAX requests used the GET method to retrieve data from the backend.
2. The GET method was used along with jsonp to allow cross-domain requests. However, this produced a 414 (Request-URI Too Large) error when the input was large, resulting in failed requests.
3. This issue was resolved by using the POST method when the request size exceeds a threshold, and the GET method otherwise.
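The decision in point 3 can be sketched as below. This is a Python illustration of the idea only; the function name, the 2000-character limit and the use of the full URL length as the measure of request size are assumptions, not the values or the code used in the interface.

```python
from urllib.parse import urlencode


def build_request(base_url: str, params: dict, max_url_length: int = 2000):
    """Return (method, url, body), using GET for short requests and POST otherwise.

    Hypothetical helper; max_url_length is an illustrative threshold.
    """
    query = urlencode(params)
    get_url = f"{base_url}?{query}"
    if len(get_url) <= max_url_length:
        # Short enough: everything travels in the URL.
        return "GET", get_url, None
    # Too long for a URL: send the same encoded parameters in the request body.
    return "POST", base_url, query


print(build_request("http://localhost:2737/speller", {"lang": "hin", "q": "माय"})[0])
print(build_request("http://localhost:2737/speller", {"lang": "hin", "q": "माय " * 1000})[0])
```

Keeping GET for small requests preserves the original behaviour for typical inputs, while large inputs avoid the URL length limit entirely.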
Code
Language Dropdown going offscreen Issue
1. The source and destination language dropdowns used to go off-screen when the browser window was resized. This prevented users from choosing the language of their choice.
2. This issue was fixed by dynamically determining the available space in the browser window (triggered on resize) and adjusting the number of columns so that the languages fit inside the viewport.
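The column adjustment in point 2 amounts to a small calculation that is re-run on every resize. The sketch below shows one way to do it, in Python for illustration; the function name, the fixed column width and the maximum of four columns are assumptions, and the actual patch may compute the layout differently.

```python
import math


def dropdown_layout(num_languages: int, available_width: int, column_width: int, max_columns: int = 4):
    """Return (columns, rows) so that the dropdown fits in the available width.

    Hypothetical helper; widths are in pixels and all values are illustrative.
    """
    # Use as many columns as fit, but at least one and at most max_columns.
    columns = max(1, min(max_columns, available_width // column_width))
    rows = math.ceil(num_languages / columns)
    return columns, rows


for width in (1000, 450, 100):
    print(width, dropdown_layout(40, width, 200))
```

As the available width shrinks, the number of columns drops and the rows grow accordingly, so every language remains reachable.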
Code
LTR/RTL alignment of languages in dropdown
1. In spite of setting a left-to-right or right-to-left orientation for the language display names, the browser did not render them in the expected manner.
2. A patch was created that applies the necessary styling to the display names, along with the styling of other associated UI elements, to achieve the correct rendering.
Code
Interface breaks when cookies are disabled issue
1. The Apertium interface used to break when cookies were disabled.
2. This was because the interface interacts with the localStorage of the browser, and the browser prohibits this interaction when cookies are disabled. This case was not handled in the code.
3. The issue was resolved by handling the exception that occurs when cookies are disabled.
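The pattern in point 3 is to wrap every storage access so that a failure does not break the rest of the interface. The sketch below illustrates that pattern in Python; the real fix is in the JavaScript frontend and concerns the browser's localStorage, so the class names and the in-memory fallback here are assumptions made purely for the example.

```python
class BlockedStorage:
    """Stand-in for a storage backend that rejects every access."""

    def __getitem__(self, key):
        raise PermissionError("storage access denied")

    def __setitem__(self, key, value):
        raise PermissionError("storage access denied")


class SafeStore:
    """Wraps a storage backend and falls back to memory when it fails.

    Hypothetical sketch of the exception-handling pattern only.
    """

    def __init__(self, backend):
        self.backend = backend
        self.fallback = {}

    def set(self, key, value):
        try:
            self.backend[key] = value
        except Exception:
            # Storage unavailable: keep the value for this session only.
            self.fallback[key] = value

    def get(self, key, default=None):
        try:
            return self.backend[key]
        except Exception:
            return self.fallback.get(key, default)


store = SafeStore(BlockedStorage())
store.set("recentLangs", "eng,spa")
print(store.get("recentLangs"))
```

With this wrapper the calling code does not need to know whether persistent storage is available; preferences are simply not remembered across sessions when it is blocked.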
Code
Improve detectLanguage() functionality
1. The detectLanguage() method did not call the autoDstSelectLang() method to select a destination language automatically after the language of a given text was identified.
Code
Prevent the requests when input is empty
1. The backend handlers returned an internal server error when requests were made with empty input or when any of the necessary arguments were missing.
2. Validation that prevents such requests was added for several functionalities, including the Analyzer, the Generator, Detect Language and the APy Sandbox.
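A guard of this kind can be sketched as follows. It is written in Python for illustration, while the commits listed below are in the JavaScript frontend; the function name and the parameter names in the usage lines are assumptions.

```python
def missing_arguments(params: dict, required: list) -> list:
    """Return the names of required arguments that are absent or blank.

    Hypothetical helper: a request would only be sent when this list is empty.
    """
    missing = []
    for name in required:
        value = params.get(name)
        # Treat None, empty strings and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


print(missing_arguments({"lang": "eng", "q": "   "}, ["lang", "q"]))
print(missing_arguments({"lang": "eng", "q": "light"}, ["lang", "q"]))
```

Checking before the request is made avoids a round trip that the backend would reject anyway.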
Code
- https://github.com/apertium/apertium-html-tools/commit/f274430c648a9d2fabb5b76f88a226420f14449f
- https://github.com/apertium/apertium-html-tools/commit/928f43a205355549580174bc37bcbdb3b5cd29e8
- https://github.com/apertium/apertium-html-tools/commit/e8fa8dc755a065a1f568ee52a38cd01f1fd0188d
- https://github.com/apertium/apertium-html-tools/commit/2518e5042da41bc1007f586569c8bd31d79eb63a
- https://github.com/apertium/apertium-html-tools/commit/50f4a701e1d0bc2e986ca3c500d6909ee54322f3
Improvement of Functionalities
1. The swap button did not swap the source language and destination language on smaller screens.
2. The translate button did not call the translate() method on smaller screens.
3. The Detect Language button was active on docTranslation interface whereas the detection it used to perform was for the input text on translateText interface.
4. Calling appropriate translate() method based on the interface on which it is called.
5. Fixing the container animation issues. When the interface was switched between containers rapidly, the animation used to break and it would render a blank screen.
6. The language selectors used to overlap with the swap button for a certain set of recent source languages.
7. Adding a button that takes the user to the top of the webpage.
8. APY to APy stylizations.
9. Alignment of Translate, Analyze and Request buttons with their respective textareas on the interface.
10. Execute translate() method as soon as any of source languages or destination languages is changed. (so that it executes even on docTranslation interface)
The above issues were resolved through the following patches:
Code
- https://github.com/apertium/apertium-html-tools/commit/1859e7c2db9d46a7237c2117c39d2130df7305f5
- https://github.com/apertium/apertium-html-tools/commit/5e1d43117ed5f05b11092099e70cdd474c2348e1
- https://github.com/apertium/apertium-html-tools/commit/daee7bd9989cd8030a79907a7b39e00a1343580b
- https://github.com/apertium/apertium-html-tools/commit/86034c308835059eed0407b83a94f0599dfe4cb5
- https://github.com/apertium/apertium-html-tools/commit/26c02d5f7fd608265157bc0022b8cf056cc8c59c
- https://github.com/apertium/apertium-html-tools/commit/e6955c7778c651e55c9ce6acb1d44effd7d6d2b1
- https://github.com/apertium/apertium-html-tools/commit/4218f48b7b2a15647989558c7a39970fcc158705
- https://github.com/apertium/apertium-html-tools/commit/203de5af21bcd60810592e394ac2de589bdc78ea
- https://github.com/apertium/apertium-html-tools/commit/5e8cb605007fbf28a09ff1ab5479d9222b8cb21c
- https://github.com/apertium/apertium-html-tools/commit/28c923c93971f428c075fd8c67e75dd9656d814f
Miscellaneous Issues
1. Mark unknown checkbox to be sent with docTranslation interface.
2. Textarea sizes getting restored on page resize.
Pull requests have been created to solve the above issues.
Code
- https://github.com/apertium/apertium-html-tools/pull/180
- https://github.com/apertium/apertium-html-tools/pull/152
Important Links
1. Apertium Wiki
2. Apertium Web Interface
3. Apertium html-tools github
4. Apertium APy github
5. Apertium html-tools forked repo github
6. Apertium APy forked repo github
7. Commits to master (pull requests that got merged):
Frontend:
Backend:
8. Issues opened by me:
Frontend:
- https://github.com/apertium/apertium-html-tools/issues?q=is%3Aissue+author%3Ashare-with-me+is%3Aopen
- https://github.com/apertium/apertium-html-tools/issues?q=is%3Aissue+author%3Ashare-with-me+is%3Aclosed
Backend:
9. Pull requests by me:
Frontend:
- https://github.com/apertium/apertium-html-tools/pulls/share-with-me
- https://github.com/apertium/apertium-html-tools/issues?q=is%3Apr+author%3Ashare-with-me+is%3Aclosed
Backend: