User:OverPowered/GSoC2021 Progress Report

From Apertium
Revision as of 08:02, 14 August 2021 by OverPowered (talk | contribs)

Google Summer of Code 2021 Progress Report: A Reworked Apertium Browser Plugin

Community Bonding Period (May 17 - June 7)


Extension Mockup 1
Extension Mockup 2
Extension Mockup 3
  • Understand better how the Apertium API works
  • Identify parts of the Geriaoueg extension that still work
  • Check the source code of similar extensions
  • Design tests, experiments and evaluation procedures for an extension (which sites it should be able to read, etc.)
  • Write a workflow diagram of the improved extension, what background processes it will have and what permissions it will need


I've gone through the Apertium APy code and built it from source. The parts that I need are outlined in the initial proposal, and I don't feel I need to add anything more. As for the Geriaoueg extension, given how outdated it is, it's better to build a new one from scratch with web-ext than to reuse parts of it.

As for other extensions, I've been looking at OnHover Translate for inspiration regarding layout and background processes. I also believe it's best not to copy the exact method they've used there, and to opt for a normal popup.html instead of an element created with JavaScript on the page itself, if only to keep it simpler.

As for the sites it should be able to read, a good start would be wikis, news sites and the more popular social media websites like Reddit, Facebook, etc. It should also be able to operate on the Google Search page. I'm still learning about implementing unit, integration and system tests for a web extension. It seems to be more complex than I previously thought, so it might take me a bit more time.

The extension doesn't need many permissions. Storage will be required only for storing existing language pairs and the default target language, so the limit of 5 MB will be more than enough. Beyond that, the only permissions I feel are needed are `tabs`, `contextMenus` (if I add anything to the right-click menu) and `clipboardWrite` to allow copying data with a button.
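
A minimal sketch of how that storage might be used, assuming a promise-based storage area shaped like `browser.storage.local` (`get` and `set`). The key names `languagePairs` and `defaultTarget`, the default of `"eng"`, and all helper names are hypothetical and not taken from the extension itself.

```javascript
// Hypothetical keys and defaults; the real extension may use different ones.
const DEFAULTS = { languagePairs: [], defaultTarget: "eng" };

// `area` is any object with promise-based get/set, e.g. browser.storage.local.
async function saveSettings(area, languagePairs, defaultTarget) {
  await area.set({ languagePairs, defaultTarget });
}

async function loadSettings(area) {
  // Passing an object to get() asks for those keys, with the given defaults.
  const stored = await area.get(DEFAULTS);
  return { ...DEFAULTS, ...stored };
}

// In-memory stand-in for a storage area so the sketch runs outside a browser.
function createMemoryArea() {
  const data = {};
  return {
    async set(items) {
      Object.assign(data, items);
    },
    async get(defaults) {
      const out = { ...defaults };
      for (const key of Object.keys(defaults)) {
        if (key in data) out[key] = data[key];
      }
      return out;
    },
  };
}
```

In the extension, `createMemoryArea()` would be swapped for the browser's storage area; keeping the area as a parameter also makes the helpers easy to test.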


The layout is mostly inspired by the layout of the actual Apertium website. I created three mockups of the extension, with the first one being completely identical to the website. The second one is a bit more compact, with the input and output text areas positioned vertically. As for the third mockup, it is the one most geared to be an extension, with a single text bar used for both input and output.

As for the checkbox at the bottom, initially I felt it best to include a 'Disable on this Website' check to disable the hover-in functionalities of the extension. But owing to concerns about increased traffic for the API, it seems a better idea to set the option to 'Enable on this Website'.
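
As a rough sketch of the opt-in behaviour, the check could be a lookup of the page's hostname in a list of enabled sites. The function names and the idea of storing a plain list of hostnames are assumptions made for illustration.

```javascript
// Hypothetical helper: hover translation is off unless the site was enabled.
function isHoverEnabled(pageUrl, enabledHosts) {
  let host;
  try {
    host = new URL(pageUrl).hostname;
  } catch {
    return false; // not a parseable URL, so never enable
  }
  return enabledHosts.includes(host);
}

// Called when the checkbox changes; returns a new list instead of mutating.
function setHoverEnabled(pageUrl, enabledHosts, enabled) {
  const host = new URL(pageUrl).hostname;
  const rest = enabledHosts.filter((h) => h !== host);
  return enabled ? [...rest, host] : rest;
}
```

Because the default answer is `false`, a site that was never ticked generates no hover requests to the API.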

Also, while it's easy to change later, I'll be going with the second layout, as a single input bar would make copy/paste operations harder.

Week 1 (June 7 - June 14)


Pop-Up Main
Pop-Up Settings

Set up a basic extension on Chromium that can:

  • Detect the word it hovers on (on websites like Wikipedia, news websites, etc.)
  • Show a basic popup when clicked on


Most basic pop-up functionalities are complete, to the point of marking some goals off of next week's checklist. There's also the matter of the word hover-on gist, which I've covered a bit in the next part. Also, the mockups for the pop-up from last week are complete, with both the front part and the settings. Word Hover has been implemented as of 13/7. Right now it is only capable of highlighting all the words nested inside a <p> tag.

Detecting a Hovered Word

I've been looking into the word-hover function and I had two leads:

One, wrap every single word in a <span> or, better yet, a custom element like <hover> (because messing with <span>'s might absolutely destroy some webpages) which shows a hovered translation. An example of that would be this site. It's machine translation for native Chinese webnovels, but the webpage source is much more interesting. Most of this is also tied in to the HTML of the page, which seems much better than the JavaScript hell that is the next option.

The second option is to get the position of the cursor from the browser (major browser-compatibility issues right there) and then traverse the entire DOM tree looking for the exact leaf node the cursor is on. An example of this one is in section 2 of this StackOverflow answer. Sidenote: this is an old answer, but it covers the method pretty well.
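
To give a feel for the second option, here is a minimal sketch of its last step: once the browser has handed back a text node and a character offset under the cursor, the word around that offset can be picked out with plain string handling. The function name `wordAtOffset` is hypothetical, and treating only letters and digits as word characters is a simplifying assumption.

```javascript
// Returns the word in `text` containing character index `offset`, or null
// if the offset is out of range or falls on whitespace/punctuation.
function wordAtOffset(text, offset) {
  if (offset < 0 || offset >= text.length) return null;
  // Assumption: a "word" is a run of Unicode letters and digits.
  const isWordChar = (ch) => /[\p{L}\p{N}]/u.test(ch);
  if (!isWordChar(text[offset])) return null;
  let start = offset;
  while (start > 0 && isWordChar(text[start - 1])) start--;
  let end = offset + 1;
  while (end < text.length && isWordChar(text[end])) end++;
  return { word: text.slice(start, end), start, end };
}

// Possible browser-side use (not runnable outside a page). Chromium exposes
// document.caretRangeFromPoint; Firefox has document.caretPositionFromPoint.
//
// document.addEventListener("mousemove", (event) => {
//   const range = document.caretRangeFromPoint(event.clientX, event.clientY);
//   if (range && range.startContainer.nodeType === Node.TEXT_NODE) {
//     const hit = wordAtOffset(range.startContainer.textContent, range.startOffset);
//     if (hit) console.log(hit.word);
//   }
// });
```

The difference between the two caret APIs is one instance of the browser-compatibility issue mentioned above.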

For now I'm implementing the custom tag method and figuring out a more elegant way to accomplish the second one.

Week 2 (June 14 - June 21)


  • If needed, this is the time to tweak APy functionalities slightly, but this should not be necessary
  • Apart from that, actually hook up the extension to use the API functionalities to translate in the pop-up
  • Properly set up the HTML pop-up of the extension
  • Validate input given to pop-up
  • Enable different options for translation


Most of these were actually done inadvertently during the Community Bonding Period so I'll focus on getting tests up and running once all of these are done.
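
A minimal sketch of the pop-up's translate call and input validation, assuming APy's `/translate` endpoint with `langpair` and `q` query parameters and a JSON reply carrying `responseData.translatedText`. The base URL, the length cap and the helper names are assumptions for illustration, not the extension's actual code.

```javascript
// Assumed public APy endpoint; a self-hosted APy would use its own base URL.
const APY_BASE = "https://apertium.org/apy";

// Rejects empty input and input over a (hypothetical) length cap.
function validateInput(text, maxLength = 5000) {
  const trimmed = text.trim();
  if (trimmed.length === 0) return { ok: false, reason: "empty" };
  if (trimmed.length > maxLength) return { ok: false, reason: "too long" };
  return { ok: true, text: trimmed };
}

// Builds the request URL; URLSearchParams handles the escaping of "|" and spaces.
function buildTranslateUrl(source, target, text, base = APY_BASE) {
  const params = new URLSearchParams({ langpair: `${source}|${target}`, q: text });
  return `${base}/translate?${params}`;
}

// Validates, sends the request and returns the translated text.
async function translate(source, target, text) {
  const checked = validateInput(text);
  if (!checked.ok) throw new Error(`invalid input: ${checked.reason}`);
  const response = await fetch(buildTranslateUrl(source, target, checked.text));
  if (!response.ok) throw new Error(`APy returned HTTP ${response.status}`);
  const body = await response.json();
  return body.responseData.translatedText;
}
```

For example, `translate("eng", "spa", "hello world")` would request `/translate` with `langpair=eng|spa` and resolve to whatever text the server returns.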

Week 3 (June 21 - June 28)


  • Start work on hover functionality
  • Design the hovering text-box and the details that will be shown on it
  • Implement translation features for hover function too


Finally the main part of the project! The basic translation functions have all been set up; I just have to show the result in the form of a tooltip or a hovering box. Work on this was completed fairly early in the week.

Hover-On Translation

For the hover-on translation, displaying the hover box itself is implemented entirely without JavaScript by wrapping every word in a <hover> tag with two data attributes to represent the information shown in it and its position. All the HTML within a <p> tag is replaced using a regex pattern.


I referred to this StackOverflow answer for designing the tooltips.
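
A rough sketch of the word-wrapping idea, under stated assumptions: the pattern used here (runs of Unicode letters and digits) is a stand-in and not the extension's own regex, the attribute names `data-info` and `data-pos` are hypothetical, and the sketch works on plain text rather than on the inner HTML of a <p> tag.

```javascript
// Escape text so it is safe in HTML content and double-quoted attributes.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wraps each run of letters/digits in a <hover> element. data-info holds the
// word itself as a placeholder for whatever the tooltip should show, and
// data-pos holds the word's index in the text.
function wrapWords(plainText) {
  let index = 0;
  return plainText
    .split(/([\p{L}\p{N}]+)/u)
    .map((piece, i) => {
      // With a capturing group in split(), odd positions hold the words.
      if (i % 2 === 0) return escapeHtml(piece);
      const word = escapeHtml(piece);
      return `<hover data-info="${word}" data-pos="${index++}">${word}</hover>`;
    })
    .join("");
}
```

With markup like this, a stylesheet rule such as `hover:hover::after { content: attr(data-info); }` can show the box without any script running on hover.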

Week 4 (June 28 - July 5)


  • Start work on the translate entire document feature
  • Experiment using the /translateDoc functionality for the .html page
  • If this doesn't work, translate the document section by section


So far so good. Document translation gave a few problems, mainly finding and separating all the elements that needed translation. Ultimately it is being done section by section, just by passing it as a document.

Week 5 (July 5 - July 12)


  • While it was made in Chromium, this week is for making sure there are no problems running whatever's been implemented so far on other browsers (in order: Chrome, Firefox, Edge)
  • Final touch-ups on the current functionalities of the extension
  • Start working on documentation


Document translation took more time than I thought. Finally finished this week with help from this gist. I've been debugging this thing the entire time so in hindsight this point really doesn't make sense. It's a similar case with documentation. Might start setting up tests this week too.

Week 6 (July 16 - July 23)


  • Start work on inline-gist translation i.e. the context-based hover translation
  • Experiment using the wordbound blanks made in last year's GSoC


Delaying the contextual translation for a bit to finish the tests.


Tests are implemented using Puppeteer and Mocha. Puppeteer is used in headful mode to run extensions, and Mocha is the main testing library used to assert and test. Installation instructions are on the GitHub repo.
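
A minimal sketch of what such a setup can look like, assuming the unpacked extension lives in a directory passed to Chromium through the `--load-extension` and `--disable-extensions-except` flags. The helper name and the `./extension` path in the usage comment are hypothetical; the actual tests are in the repo.

```javascript
// Builds Puppeteer launch options for loading an unpacked extension.
// headless: false matches the headful mode described above.
function extensionLaunchOptions(extensionPath) {
  return {
    headless: false,
    args: [
      `--disable-extensions-except=${extensionPath}`,
      `--load-extension=${extensionPath}`,
    ],
  };
}

// Usage inside a Mocha test file (needs puppeteer and mocha installed):
//
// const puppeteer = require("puppeteer");
// const assert = require("assert");
//
// describe("extension smoke test", function () {
//   let browser;
//   before(async () => {
//     browser = await puppeteer.launch(extensionLaunchOptions("./extension"));
//   });
//   after(async () => {
//     await browser.close();
//   });
//   it("loads a page with the extension active", async () => {
//     const page = await browser.newPage();
//     await page.goto("https://example.com");
//     assert.ok(await page.title());
//   });
// });
```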

Week 7 (July 23 - July 30)


  • Continue working on the previous week's start
  • If it has been implemented in one direction, then work on getting it working in the reverse too


Kinda late but it'll align with the checklist next week. Started (and finished) Contextual Translation this week.

Contextual Translation

The existing hover-tags (now <span>'s instead of <hover>) work very well with context translation. They're inline tags, so they stick to their word in the pipeline and can retain the original meaning through a data-original attribute. After that, it's just a bit of parsing and replacement to get contextual translation.
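
A small sketch of the parsing step, assuming the translated HTML comes back with each word still wrapped in its own <span> and the `data-original` attribute intact. The regex and the function name are illustrative only; inside a real page, DOM methods such as `querySelectorAll("span[data-original]")` would likely be more robust than a regex.

```javascript
// Pairs each translated word with the original word kept in data-original.
// Assumes spans are not nested and contain only text.
function extractWordPairs(translatedHtml) {
  const pattern = /<span\b[^>]*\bdata-original="([^"]*)"[^>]*>([^<]*)<\/span>/g;
  return Array.from(translatedHtml.matchAll(pattern), (match) => ({
    original: match[1],
    translated: match[2].trim(),
  }));
}
```

Each pair then gives the hover box both sides: the in-context translation and the word it came from.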

Week 8 (July 30 - August 6)


  • During this week, the tests prepared in the Community Bonding Period will be used to test all the functionalities implemented so far.
  • If needed, changes will be made but the project should be up and running at this point.
  • Apart from that, write documentation for the extension


Tests have been up for a while now so there's that. The extension has also been complete for the most part, with all requirements on the GSoC checklist marked as done. Unless this counts, I'll make the page for Apertium Webext and the GSoC writeup gist soon.

Week 9 (August 6 - August 13)


  • Minor bugfixes
  • UI redesign(???)
  • Small optimisations in code


...I'm not sure what I was thinking when I put this stuff in the original proposal. Anyways, most of it was done during the original design. Code optimisations are possible, but the way it is right now is great too. Finally, I'm done with most of the writeups too.

Week 10 (August 13 - August 16)

Checklist (Or lack of one)

Intentionally kept free, so as to sort out any issues that crop up before this and cause any unforeseen delay

The only thing left is finishing the Wiki page and checking for browser compatibility with Firefox Android. With the exception of these, I'm fairly happy with the progress of this project.