User:OverPowered/GSoC2021 Progress Report
Google Summer of Code 2021 Progress Report: A Reworked Apertium Browser Plugin
Contents
- 1 Community Bonding Period (May 17 - June 7)
- 2 Week 1 (June 7 - June 14)
- 3 Week 2 (June 14 - June 21)
- 4 Week 3 (June 21 - June 28)
- 5 Week 4 (June 28 - July 5)
- 6 Week 5 (July 5 - July 12)
- 7 Week 6 (July 16 - July 23)
- 8 Week 7 (July 23 - July 30)
- 9 Week 8 (July 30 - August 6)
- 10 Week 9 (August 6 - August 13)
- 11 Week 10 (August 13 - August 16)
Community Bonding Period (May 17 - June 7)
Checklist
- Understand better how the Apertium API works
- Identify parts of the Geriaoueg extension that still work
- Check source code of similar extensions
- Design tests, experiments and evaluation procedures for an extension (which sites it should be able to read, etc.)
- Write a workflow diagram of the improved extension, what background processes it will have and what permissions it will need
Progress
I've gone through the Apertium APy code and built it from source. The parts that I need are outlined in the initial proposal and I don't feel I need to add anything more. As for the Geriaoueg extension, given how outdated it is, it's better to build a new one from scratch with web-ext than to reuse parts of it.
As for other extensions, I've been looking at OnHover Translate for inspiration regarding layout and background processes. I also believe it's best not to copy the exact method they've used there and to opt for a normal popup.html instead of an element created with JavaScript on the page itself, if only to keep it simpler.
As for the sites it should be able to read, a good start would be wikis, news sites and the more popular social media websites like Reddit, Facebook, etc. It should also be able to operate on the Google Search page. I'm still learning about implementing unit, integration and system tests for a web extension. It seems to be more complex than I previously thought, so it might take me a bit more time.
The extension does not need many permissions. Storage will be required only for storing existing language pairs and the default target language, so the limit of 5 MB will be more than enough. The only permissions I feel are needed are `tabs`, `contextMenus` if I add anything to the right-click menu, and `clipboardWrite` to allow copying data with a button.
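As a rough illustration of the permissions listed above, a manifest could look like the sketch below. It is written as a JavaScript object so it can be serialised to `manifest.json`. The extension name, version and the Manifest V2 layout are assumptions for the example, as is listing `storage` explicitly (the storage API normally requires it); none of these values are taken from the actual project.

```javascript
// Hypothetical manifest for illustration; name and version are placeholders.
const manifest = {
  manifest_version: 2, // assumption: Manifest V2
  name: "Apertium Webext (sketch)",
  version: "0.1.0",
  permissions: [
    "storage",        // assumption: saved language pairs and default target language
    "tabs",
    "contextMenus",   // only needed if right-click entries are added
    "clipboardWrite", // lets a button copy the translation
  ],
  // The pop-up is a normal popup.html rather than an injected element.
  browser_action: { default_popup: "popup.html" },
};

// Serialise to the text that would go into manifest.json.
const manifestJson = JSON.stringify(manifest, null, 2);
```

Note the camel-case spelling `contextMenus`; browsers reject or ignore permission names they do not recognise.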
Layout
The Layout is mostly inspired by the layout of the actual Apertium website. I created three mockups of the extension, with the first one being completely identical to the website. The second one is a bit more compact, with the input and output text areas positioned vertically. As for the third mockup, it is the one most geared to be an extension, with a single text bar used for both input and output.
As for the checkbox at the bottom, initially I felt it best to include a 'Disable on this Website' check to disable the hover-in functionalities of the extension. But owing to concerns about increased traffic for the API, it seems a better idea to set the option to 'Enable on this Website'.
Although this is easy to change later, I'll be going for the second layout, as a single input bar would make copy/paste operations harder.
Week 1 (June 7 - June 14)
Checklist
Set up a basic extension on Chromium that can:
- Detect the word it hovers on (on websites like Wikipedia, news websites, etc.)
- Show a basic popup when clicked on
Progress
Most basic pop-up functionalities are complete, to the point of marking some goals off of next week's checklist. There's also the matter of the word hover-on gist, which I've covered a bit in the next part. Also, the mockups for the pop-up from last week are complete, with both the front part and the settings. Word Hover has been implemented as of 13/7. Right now it is only capable of highlighting all the words nested inside a <p> tag.
Detecting a Hovered Word
I've been looking into the word-hover function and I have two leads:
One, wrap every single word in a <span> or, better yet, a custom element such as <hover> (because messing with <span>'s might absolutely destroy some webpages), which shows a hovered translation. An example of that would be this site. It's machine translation for native Chinese webnovels, but the webpage source is much more interesting. Most of this is also tied into the HTML of the page, which seems much better than the JavaScript hell that is the next option.
The second option is to get the position of the cursor from the browser (major browser-compatibility issues right there) and then traverse the entire DOM tree looking for the exact leaf node the cursor is on. An example of this one is in section 2 of this StackOverflow answer. Side note: this is an old answer, but it covers the method pretty well.
For now I'm implementing the custom tag method and figuring out a more elegant way to accomplish the second one.
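Once the browser has reported which text node and character offset the cursor is over, the second option reduces to expanding that offset to word boundaries. The sketch below shows only that last step as a pure function; `wordAtOffset` and the set of word characters are hypothetical choices for illustration, not code from the extension. In a browser, the offset would typically come from `document.caretRangeFromPoint(x, y)` (Chromium, WebKit) or `document.caretPositionFromPoint(x, y)` (Firefox), which is where the compatibility issues come in.

```javascript
// Characters treated as part of a word; this set is an assumption.
const WORD_CHAR = /[A-Za-z0-9']/;

/**
 * Return the word covering `offset` in `text`, or null if the offset
 * is out of range or sits on whitespace or punctuation.
 */
function wordAtOffset(text, offset) {
  if (offset < 0 || offset >= text.length || !WORD_CHAR.test(text[offset])) {
    return null;
  }
  // Walk left to the first character of the word.
  let start = offset;
  while (start > 0 && WORD_CHAR.test(text[start - 1])) start--;
  // Walk right to one past the last character of the word.
  let end = offset + 1;
  while (end < text.length && WORD_CHAR.test(text[end])) end++;
  return { word: text.slice(start, end), start, end };
}
```

Keeping this step free of DOM calls means it can be unit-tested without a browser, while the browser-specific lookup stays in a thin wrapper.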
Week 2 (June 14 - June 21)
Checklist
- If needed, this is the time to tweak APy functionalities slightly, but this should not be necessary
- Apart from that, actually hook up the extension to use the API functionalities to translate in the pop-up
- Properly set up the HTML pop-up of the extension
- Validate input given to the pop-up
- Enable different options for translation
Progress
Most of these were actually done inadvertently during the Community Bonding Period so I'll focus on getting tests up and running once all of these are done.
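For reference, hooking the pop-up up to APy comes down to a GET request against its `/translate` endpoint with `langpair` and `q` parameters, then reading `responseData.translatedText` from the JSON reply. The sketch below covers only the URL building and the response handling; the helper names and the base URL in the usage comment are assumptions, and the extension's real code may be organised differently.

```javascript
/** Build an APy /translate URL for a source|target language pair. */
function buildTranslateUrl(apyBase, source, target, text) {
  // Strip trailing slashes so the path is joined cleanly.
  const url = new URL(`${apyBase.replace(/\/+$/, "")}/translate`);
  url.searchParams.set("langpair", `${source}|${target}`);
  url.searchParams.set("q", text);
  return url.toString();
}

/** Pull the translated text out of a parsed APy JSON response. */
function extractTranslation(responseBody) {
  if (responseBody.responseStatus !== 200) {
    throw new Error(`APy returned status ${responseBody.responseStatus}`);
  }
  return responseBody.responseData.translatedText;
}

// Usage in the pop-up (base URL is an assumption):
//   const reply = await fetch(buildTranslateUrl("https://apertium.org/apy", "eng", "spa", input));
//   output.value = extractTranslation(await reply.json());
```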
Week 3 (June 21 - June 28)
Checklist
- Start work on hover functionality
- Design the hovering text-box and the details that will be shown on it
- Implement translation features for the hover function too
Progress
Finally, the main part of the project! The basic translation functions have all been set up; I just have to show the result in the form of a tooltip or a hovering box. Work on this was completed fairly early in the week.
Hover-On Translation
For the hover-on translation, displaying the hover-box itself is implemented entirely without JavaScript by wrapping every word in a <hover> tag with two data attributes that hold the word's information and its position. All the HTML within a <p> tag is replaced using the regex pattern:
/(?![^<]*?>)([A-z0-9']+)/g
I referred to this StackOverflow answer for designing the tooltips.
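To make the wrapping step concrete, here is a minimal sketch of how such a pattern can be applied with `String.prototype.replace`. The negative lookahead skips anything that sits inside a tag, so tag names and attributes are left alone. Two things here are assumptions rather than the extension's actual code: the attribute names `data-text` and `data-position` are placeholders, and the character class uses `A-Za-z` instead of `A-z`, because the range `A-z` also matches `[`, `\`, `]`, `^`, `_` and the backtick.

```javascript
// Same idea as the pattern above, with A-Za-z in place of A-z.
const WORD_PATTERN = /(?![^<]*?>)([A-Za-z0-9']+)/g;

/** Wrap every word outside of tags in a <hover> element. */
function wrapWords(html) {
  let position = 0; // running index of the word within this fragment
  return html.replace(WORD_PATTERN, (word) => {
    // Attribute names are hypothetical placeholders.
    const wrapped = `<hover data-text="${word}" data-position="${position}">${word}</hover>`;
    position += 1;
    return wrapped;
  });
}
```

A caveat of any regex-based approach is that character entities such as `&amp;` contain letters and would be wrapped as well, so real pages may need those handled separately.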
Week 4 (June 28 - July 5)
Checklist
- Start work on the translate entire document feature
- Experiment using the /translateDoc functionality for the .html page
- If this doesn't work, translate the document section by section
Progress
So far so good. Document translation gave a few problems, mainly in finding and separating all the elements that needed translation. Ultimately it is being done section by section, with each section passed as a document.
Week 5 (July 5 - July 12)
Checklist
- While it was made in Chromium, this week is for making sure that whatever has been implemented so far has no problems on other browsers (in order: Chrome, Firefox, Edge)
- Final touch-ups on the current functionalities of the extension
- Start working on documentation
Progress
Document translation took more time than I thought. Finally finished this week with help from this gist. I've been debugging this thing the entire time so in hindsight this point really doesn't make sense. It's a similar case with documentation. Might start setting up tests this week too.
Week 6 (July 16 - July 23)
Checklist
- Start work on inline-gist translation, i.e. the context-based hover translation
- Experiment using the wordbound blanks made in last year's GSoC
Progress
Delaying the contextual translation for a bit to finish the tests.
Testing
Tests are implemented using Puppeteer and Mocha. Puppeteer is used in headful mode to run extensions, and Mocha is the main testing library used to assert and test. Installation instructions are in the GitHub repo.
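As a sketch of what running an extension under Puppeteer involves, the helper below builds the launch options: headful mode plus the two Chromium flags that load an unpacked extension. The function name and the `./dist` path in the usage comment are hypothetical; the repo's own test setup is the authoritative version.

```javascript
/** Build Puppeteer launch options that load an unpacked extension. */
function extensionLaunchOptions(extensionPath) {
  return {
    headless: false, // headful mode, as described above
    args: [
      // Disable every extension except the one under test, then load it.
      `--disable-extensions-except=${extensionPath}`,
      `--load-extension=${extensionPath}`,
    ],
  };
}

// Usage inside a Mocha suite (requires the puppeteer package):
//   before(async () => {
//     browser = await puppeteer.launch(extensionLaunchOptions("./dist"));
//   });
//   after(async () => { await browser.close(); });
```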
Week 7 (July 23 - July 30)
Checklist
- Continue working on the previous week's start
- If it has been implemented in one direction, then work on getting it working in the reverse too
Progress
Kinda late but it'll align with the checklist next week. Started (and finished) Contextual Translation this week.
Contextual Translation
The existing hover-tags (now <span>'s instead of <hover>) work very well with context translation. They're inline tags, so they stick to their word in the pipeline and can retain the original meaning through a data-original attribute. After that, it's just a bit of parsing and replacement to get contextual translation.
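The parsing step can be pictured as follows: after the whole sentence has gone through translation, each `<span>` still carries its source word in `data-original`, so pairing source and target words is a matter of reading the spans back out. The function below is a hypothetical sketch using a regex on the returned markup; in the extension itself a DOM query such as `querySelectorAll("span[data-original]")` would be the more natural route.

```javascript
/**
 * Collect { original, translated } pairs from translated markup in which
 * every word is wrapped in <span data-original="...">...</span>.
 */
function pairTranslations(translatedHtml) {
  const spanPattern = /<span\b[^>]*\bdata-original="([^"]*)"[^>]*>([^<]*)<\/span>/g;
  return Array.from(translatedHtml.matchAll(spanPattern), (match) => ({
    original: match[1],   // source word kept in the attribute
    translated: match[2], // word as it appears after translation
  }));
}
```

Because the spans travel with their words, the pairs stay correct even when translation reorders the sentence, for example "red house" becoming "casa roja".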
Week 8 (July 30 - August 6)
Checklist
- During this week, the tests prepared in the Community Bonding Period will be used to test all the functionalities implemented so far.
- If needed, changes will be made, but the project should be up and running at this point.
- Apart from that, write documentation for the extension
Progress
Tests have been up for a while now so there's that. The extension has also been complete for the most part, with all requirements on the GSoC checklist marked as done. Unless this counts, I'll make the page for Apertium Webext and the GSoC writeup gist soon.
Week 9 (August 6 - August 13)
Checklist
- Minor bugfixes
- UI redesign
- Small optimisations in code
Week 10 (August 13 - August 16)
Checklist (Or lack of one)
Intentionally kept free, so as to sort out any issues that crop up before this and cause any unforeseen delay