User:OverPowered/GSoC2021Proposal
Google Summer of Code 2021 Proposal: A Reworked Apertium Browser Plugin
My proposal is to develop the Apertium Browser Plugin mentioned in the Project Ideas for GSoC.
The current Geriaoueg plugin is out of date, with the official link given in the wiki being unreachable and the 2014 version on GitHub being completely unusable on both Firefox and Chrome.
The extension I plan on making will have three main functionalities:
- Translating a word or phrase that the user hovers over on a website
- Translating between an existing language pair in the extension pop-up
- Translating an entire webpage at a time
I've recorded the weekly progress of this project in its Progress Report.
Contact Details
- Name: Omkar Prabhune
- IRC: op
- Wiki: OverPowered
- Official Email: omkar.prabhune19@vit.edu
- Personal (Google) Email: omkar.prabhune.317@gmail.com
- GitHub: OverPoweredDev
- LinkedIn: Omkar Prabhune
- Timezone: UTC +5:30 or IST
About Me
I’m an undergraduate Computer Science student at VIT, Pune. I’ve been interested in Natural Language Processing and Linguistics for a couple of years now, having worked on basic projects like a Rasa-based chatbot and specialised semantic searches.
Apart from those, I also have decent experience working on browser plugins; some examples of that are here and here.
Why am I interested in Apertium?
As someone from India, where it’s somehow the norm to be trilingual or even quadrilingual before 20, I’ve grown to have a pretty decent appreciation for languages in general, both natural and programming. Apertium, with its unique approach of not using popular deep-learning-based solutions and instead working to preserve linguistic diversity for endangered languages, is an organisation I’ve been wanting to contribute to for a while.
Hence, I’ve been working on projects in Natural Language Processing for a couple of years as an undergraduate. That said, I’m still not confident enough to bring a complete language pair up to release in this short timeline, but I still want to contribute.
Having made a couple of extensions for Chrome and Firefox before, I can also confidently say that I am able to deliver a finished product with all features within the given timeframe.
So, to answer the title: both because of my excitement for the topic and my ability to contribute to this project.
Proposal
Which task am I interested in? What do I plan to do?
My proposal is to develop the Apertium Browser Plugin mentioned in the Project Ideas for GSoC.
The current Geriaoueg plugin is out of date, with the official link given in the wiki being unreachable and the 2014 version on GitHub being completely unusable on both Firefox and Chrome.
The extension I plan on making will have three main functionalities:
- Translating a word or phrase that the user hovers over on a website
- Translating between an existing language pair in the extension pop-up
- Translating an entire webpage at a time
Benefits to the Community / Why should Apertium sponsor this?
Currently the Apertium platform can be used either offline or directly on its website. A browser extension adds another platform for end users. Moreover, an extension is arguably better than a website for translation, since it is easier to use and operates within the webpage itself.
Also, considering that there’s much less time in this year’s GSoC, an extension is a project that is realistically deliverable and will greatly enhance the usability of the Apertium project. Compared to something longer and more time-intensive like bringing a new language pair to release, I can guarantee that this proposal can be developed into a finished product within the given timeline.
Implementation Plan
Most of the work will be done in the extension itself, with Apertium APy doing the real heavy lifting, i.e. language identification and translation. Because of this, the project fits more easily into this year’s shorter GSoC timeline.
The three main types of POST requests I will be using to make this are:
- /identifyLang - to first identify which language we’re translating from
- /translate - for translating words the user hovers on
- /translateDoc - for translating entire webpages (might possibly have to use /translate for that as well, depending on how well it plays with most websites)
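As a rough illustration of the requests listed above, the sketch below builds form-encoded POST bodies for /translate and /identifyLang and reads their responses. It is only a sketch under assumptions: the base URL is a placeholder, the helper names are made up for this example, and the parameter names (`langpair`, `q`) and response shapes reflect my understanding of APy and should be checked against the instance actually used.

```typescript
// Placeholder base URL (assumption); point this at the APy instance in use.
const APY_BASE = "https://apertium.org/apy";

// Assumed shape of a /translate response.
interface TranslateResponse {
  responseData: { translatedText: string };
  responseStatus: number;
}

// Assumed shape of an /identifyLang response: language code -> score.
type IdentifyLangResponse = Record<string, number>;

interface FormRequest {
  url: string;
  body: string;
}

// Build the URL and form-encoded body for a POST to /translate.
function buildTranslateRequest(text: string, from: string, to: string): FormRequest {
  const params = new URLSearchParams({ langpair: `${from}|${to}`, q: text });
  return { url: `${APY_BASE}/translate`, body: params.toString() };
}

// Build the URL and form-encoded body for a POST to /identifyLang.
function buildIdentifyLangRequest(text: string): FormRequest {
  const params = new URLSearchParams({ q: text });
  return { url: `${APY_BASE}/identifyLang`, body: params.toString() };
}

// Pick the language code with the highest score, or null if there are none.
function pickBestLanguage(scores: IdentifyLangResponse): string | null {
  let best: string | null = null;
  let bestScore = -Infinity;
  for (const [lang, score] of Object.entries(scores)) {
    if (score > bestScore) {
      best = lang;
      bestScore = score;
    }
  }
  return best;
}

// Pull the translated text out of a /translate response, failing loudly on errors.
function extractTranslation(response: TranslateResponse): string {
  if (response.responseStatus !== 200) {
    throw new Error(`APy reported status ${response.responseStatus}`);
  }
  return response.responseData.translatedText;
}

// Send a prepared request with fetch and parse the JSON reply.
async function postForm<T>(request: FormRequest): Promise<T> {
  const res = await fetch(request.url, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: request.body,
  });
  if (!res.ok) {
    throw new Error(`APy returned HTTP ${res.status}`);
  }
  return (await res.json()) as T;
}
```

Keeping the request builders and response readers separate from the network call means they can be unit-tested without a running APy server.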
Note: In order to translate entire web pages more effectively, as <TinoDidriksen> suggested, it might be more efficient and future-proof to tag all ‘to be translated’ inline elements with a class in a completely separate transport HTML document and then pass this new document to /translateDoc to be translated.
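One way the transport document idea could look is sketched below. The class name, the `data-id` attribute and both function names are hypothetical, and the sketch assumes that /translateDoc returns the document with the span tags and their attributes unchanged, which would need to be verified. A real extension would more likely read the result with DOMParser; plain string handling is used here only to keep the example self-contained.

```typescript
// Hypothetical class name used to mark elements that should be translated.
const MARKER_CLASS = "apertium-translate";

// Escape the characters that would otherwise be read as markup.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Reverse escapeHtml; &amp; is handled last so "&amp;lt;" becomes "&lt;", not "<".
function unescapeHtml(text: string): string {
  return text.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}

// Build a standalone HTML document holding one marked <span> per text segment.
// The index in data-id lets each translated segment be matched to its source element.
function buildTransportDocument(segments: string[]): string {
  const spans = segments.map(
    (text, i) => `<span class="${MARKER_CLASS}" data-id="${i}">${escapeHtml(text)}</span>`
  );
  return `<!DOCTYPE html><html><body>\n${spans.join("\n")}\n</body></html>`;
}

// Read the segments back out of a (translated) transport document, keyed by data-id.
function readTransportDocument(html: string): Map<number, string> {
  const pattern = new RegExp(
    `<span class="${MARKER_CLASS}" data-id="(\\d+)">([^<]*)</span>`,
    "g"
  );
  const result = new Map<number, string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    result.set(Number(match[1]), unescapeHtml(match[2]));
  }
  return result;
}
```

The extension would collect the text of each tagged element into `segments`, send the built document to /translateDoc, and then write each returned segment back into the element with the matching index.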
Another addition to the project is context-based hover translations. The idea is to build upon the feature added in last year’s GSoC, Markup handling with wordbound blanks. However, instead of using this to maintain markup formatting through documents, we bind the original word through the translation pipeline and then display this context-based translation upon hover.
This can also be implemented in the reverse direction, i.e. upon translating an entire document, the extension can display the original word that each translated word came from.
As for the extension itself, when it is first installed it will ask users which language they want to translate to. This will be saved as the default target language when showing translations on hover or in the extension pop-up.
After this initial setup, the extension mainly listens for two events: either the user hovers over a word, or the user asks to translate a custom phrase or the entire web page from the pop-up.
Hover
- In this case, the extension first finds the relevant div using jQuery’s mouseover() method and then the exact word using the mouse coordinates.
- The language of the word is then identified using the /identifyLang endpoint of the API.
- The word is then translated using the /translate endpoint.
- The hover text above the word will display the language it was translated from, as well as its meaning.
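The hover steps above can be sketched roughly as follows. The function names, the separator set and the tooltip format are illustrative assumptions. The sketch also assumes that the text and character offset under the mouse have already been obtained, for example with `document.caretPositionFromPoint` or the older `document.caretRangeFromPoint`; that wiring is left out. The identify and translate steps are passed in as functions so the flow does not depend on any particular API client.

```typescript
// Characters treated as word boundaries (an assumption; tune per language).
const SEPARATORS = /[\s.,;:!?()"\[\]{}]/;

// Return the word containing the character at `offset`,
// or null if the offset is out of range or lands on a separator.
function wordAtOffset(text: string, offset: number): string | null {
  if (offset < 0 || offset >= text.length) return null;
  const isWordChar = (ch: string): boolean => !SEPARATORS.test(ch);
  if (!isWordChar(text[offset])) return null;
  let start = offset;
  while (start > 0 && isWordChar(text[start - 1])) start--;
  let end = offset + 1;
  while (end < text.length && isWordChar(text[end])) end++;
  return text.slice(start, end);
}

// Injected steps: in the extension these would call /identifyLang and /translate.
type Identify = (text: string) => Promise<string | null>;
type Translate = (text: string, from: string, to: string) => Promise<string>;

// Identify the source language, translate the word, and build the tooltip text.
// Returns null when the language is unknown or already the target language.
async function buildTooltip(
  word: string,
  targetLang: string,
  identify: Identify,
  translate: Translate
): Promise<string | null> {
  const sourceLang = await identify(word);
  if (sourceLang === null || sourceLang === targetLang) return null;
  const meaning = await translate(word, sourceLang, targetLang);
  return `${sourceLang}: ${meaning}`;
}
```

Returning null for words already in the target language avoids showing a tooltip that would only repeat the hovered word.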
Pop-Up
- Here the user enters the language they are translating from and to
- Beyond that, it’s fairly similar to the actual website, with one input text box, one output box and two dropdowns to select the languages
- Naturally, there will be a Detect Language option for the input language
- A button below this would give the option to translate the entire web page
Deliverable
A browser plugin supported on at least Firefox and Chrome, able to translate individual words and entire webpages (to be tested on websites like Wikipedia, BBC News, The Economic Times, and social media such as Facebook, Reddit, and Twitter).
Timeline
Phase 1
Community Bonding Period (May 17 - June 7) |
Week 1 (June 7 - June 14) | Set up basic extension on chromium that can:
Week 2 (June 14 - June 21) |
Week 3 (June 21 - June 28) |
Week 4 (June 28 - July 5) |
Week 5 (July 5 - July 12) |
Deliverable #1 | Browser plugin that can translate words hovered on or those typed into its input pop-up. Usable on Firefox and the most popular Chromium-based browsers (Chrome, Edge, Brave)
Phase 2
Week 6 (July 16 - July 23) |
Week 7 (July 23 - July 30) |
Week 8 (July 30 - August 6) |
Week 9 (August 6 - August 13) | The extension should be completely functional by now. All that’s left at this point will be:
Week 10 (August 13 - August 16) | Intentionally kept free, so as to sort out any issues that crop up before this and cause any unforeseen delay
Deliverable #2 | The project will be complete at this point. A browser plugin supported on at least Firefox and Chrome, able to translate individual words and entire webpages (tested on websites like Wikipedia, BBC News, The Economic Times, and social media such as Facebook, Reddit, and Twitter).
Other Summer Plans
I have none, so I’ll be free to work on this full time.