User talk:Sourcemorph

From Apertium
Revision as of 15:10, 24 March 2009 by Sourcemorph (talk | contribs)

Name: Mohit Verma

Email address: mohit.verma.in@gmail.com

IRC nick: sourcemorph, sourcemorph1

Google Talk: mohit.verma.in@gmail.com

Phone Number: +91 9823792900

Why is it you are interested in machine translation?

The world is a small place today, as people from varying cultural and linguistic backgrounds frequently interact with each other. A lot of this interaction takes place through a computer (perhaps a desktop or an embedded device). Machine translation gives a computer the power to help its user cross the language barrier. Neither the content meant for us nor the prospective audience for our own content needs to be limited by the languages we know, because the computer can be used to bridge that gap.

Why is it that you are interested in the Apertium project?

The Apertium Project has, apart from technological value, a strong humanitarian purpose. It aims at preserving marginalized languages as well as supporting the more widely spoken ones. This, in my opinion, will also help computers reach the people who speak these languages. Most common software is not available in these languages, and their speakers would probably be more comfortable if it were. The Apertium project can be used to bridge this language barrier, so that software can interact with a wider user base, including people whose languages do not have many speakers. As mentioned in the wiki, it provides a platform for people to build any number of translation systems, which in the long run can be used to cover, possibly, every language in the world. At a personal level, I am interested in creating utility-based applications using the Apertium project to help people bridge the language gap, as mentioned earlier. I'd also like to contribute to the development of the language pair that I am comfortable with, that is Hindi-English.

Which of the published tasks are you interested in?

I am interested in the Interface task mentioned in the idea list.

What do you plan to do?

I will be working on the following three things:

1. OpenOffice extension: to add a few things to the OOOApertium extension developed by Mr. Miguel Gea Milvaques.

The existing tool is adequate, but its interface can be improved. A right-click menu item “Translate” can be added, with sub-menus containing translations of the selected text into all the languages the user has chosen in a configuration window. Clicking on one of these translations replaces the selected text with the translated text. The user can also choose to translate into a different language, in which case a sidebar-like window (akin to the Animations window in Impress) will open. Here, all the languages for which a language-pair pack has been installed will be listed. Should the user want to translate into yet another language, the Apertium web service will be used. The configuration menu can also be improved, with options to search for installed language pairs and to select the frequently used languages. Apertium-dbus will be used to call Apertium for translating the text.
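The choice between a locally installed language pair and the web service, as described above, could look roughly like the following. This is only a sketch of the menu logic in Python, not the extension itself: the function names, the `(source, target)` pair representation and the two backend callables are assumptions made for illustration, and the real calls to Apertium-dbus and to the web service are deliberately left out.

```python
def choose_backend(source_lang, target_lang, installed_pairs):
    """Return "local" if the pair is installed, otherwise "web"."""
    if (source_lang, target_lang) in installed_pairs:
        return "local"
    return "web"


def build_menu_entries(text, source_lang, preferred_targets,
                       installed_pairs, backends):
    """Build one (target, backend, translation) entry per preferred language.

    `backends` maps "local" and "web" to callables taking
    (text, source_lang, target_lang); both are hypothetical stand-ins
    for the Apertium-dbus call and the web-service call.
    """
    entries = []
    for target in preferred_targets:
        backend = choose_backend(source_lang, target, installed_pairs)
        translation = backends[backend](text, source_lang, target)
        entries.append((target, backend, translation))
    return entries
```

Keeping the backend choice in one small function means the right-click menu and the sidebar could share it.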

2. A Pidgin plug-in, which enables people who do not know a common language to chat with each other.

During a conversation on Pidgin (on any protocol), when both parties enable this plug-in and specify their own language and the target language, this Apertium-based plug-in will translate each message from the sender's source language into the target language before it is sent to the receiver. There can be an option to remove the original text completely and keep only the translated text, in which case the conversation will appear to each party as being solely in their native language.
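The message handling described above can be sketched as follows. This is a minimal illustration in Python rather than plug-in code: `prepare_outgoing` and its parameters are hypothetical names, and `translate` is an assumed callable standing in for whatever call to Apertium the plug-in would eventually make.

```python
def prepare_outgoing(text, source_lang, target_lang, translate,
                     keep_original=True):
    """Return the message body that would be sent to the other party.

    `translate` is a hypothetical callable taking
    (text, source_lang, target_lang) and returning the translated text.
    """
    translated = translate(text, source_lang, target_lang)
    if keep_original:
        # Send both versions so the receiver can see the original wording.
        return "%s\n[%s] %s" % (text, target_lang, translated)
    # Translation only: the conversation appears to be in a single language.
    return translated
```

With `keep_original=False` the receiver sees only the translated text, which corresponds to the option of removing the original completely.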

3. A Firefox/Thunderbird plug-in, to translate an entire page or a selection using the Apertium web service.

The user can select the source and target languages from a menu in the status bar. After that, when he right-clicks on a selection of text, the translated text will appear as the first item in the right-click menu; clicking on it will copy the translated text to the clipboard. There will be a second option in the right-click menu (and also in Tools/Apertium_plugin_name) to translate the entire HTML page into a different language. A local HTML page will be generated, with selected fields in the HTML code translated using the Apertium web service. This new page will remain functional, for which all relative URLs will be changed to absolute ones. The idea is to translate as much as possible without rendering the page non-functional.
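The step of changing relative URLs to absolute ones can be illustrated with a short sketch. It is written in Python with the standard library only, whereas the extension itself would be built with Firefox's own technologies. The class and function names are made up for this example, and the set of URL-carrying attributes is an assumption that a real implementation would need to extend.

```python
import html
from html.parser import HTMLParser
from urllib.parse import urljoin

# Attributes assumed to carry URLs; illustrative, not exhaustive.
URL_ATTRS = {"href", "src", "action"}


class AbsoluteUrlRewriter(HTMLParser):
    """Re-emit an HTML page with relative URLs resolved against base_url."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def _tag(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if value is None:            # attribute without a value
                parts.append(name)
                continue
            if name in URL_ATTRS:
                value = urljoin(self.base_url, value)
            parts.append('%s="%s"' % (name, html.escape(value, quote=True)))
        return "<" + " ".join(parts) + closing + ">"

    def handle_starttag(self, tag, attrs):
        self.out.append(self._tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._tag(tag, attrs, " /"))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def make_links_absolute(page_html, base_url):
    parser = AbsoluteUrlRewriter(base_url)
    parser.feed(page_html)
    parser.close()
    return "".join(parser.out)
```

For example, with the base URL `http://example.org/wiki/Main`, a link to `docs/index.html` becomes `http://example.org/wiki/docs/index.html`, so the locally saved page still points at the original site.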

Proposal

Title: Interfaces between Apertium and popular open-source software: OpenOffice, Pidgin, Firefox and Thunderbird.

Why Google and Apertium should sponsor it?

The tasks that I intend to do might not have an impact on core Apertium development (that is, apertium, lttoolbox or any of the language pairs), as they will reside at the interface level, but they will increase the number of people who use and benefit from the Apertium project. I am targeting four very commonly used open-source applications, namely OpenOffice, Pidgin, Firefox and Thunderbird, and these extensions will significantly increase Apertium's usefulness. I hope that this will also help in attracting more developers to work at the core level and in the addition of more language pairs.

How and who it will benefit in society?

One of the many features that Microsoft Office has but OpenOffice lacks is translation. Adding this functionality will go some way towards making OpenOffice equally feature-rich. Since OpenOffice is the most common office suite in the free software community, the improvement will benefit a lot of people, especially those who work in linguistically rich environments.

The Pidgin plug-in will be the most important of the three, as it will help people communicate irrespective of the languages they know. A lot of people frequently feel the need for this, for example developers from different places working on the same open-source project. As mentioned earlier, this is one of the ways in which the linguistic gap between people can be bridged.

The Firefox/Thunderbird plug-in might not be a fresh idea, as a lot of translation tools are already available. However, Apertium is more of a platform than an individual system, and more languages are always being added, so an Apertium-based plug-in for translating web and email content will be useful for everyone who uses Firefox for browsing and Thunderbird for managing email.

Work plan


     Week 1: Get familiar with the OpenOffice SDK, understand the OOOApertium code and Apertium-dbus, plan and design the extension.
   
     Week 2: Code the additional items in the "Apertium" menu, the configuration sub-menu (which displays the installed language pairs, lets the user install new ones, and sets the frequently used languages and other settings) and the right-click menu. Implement the basic functionality.
   
     Week 3: Create the sidebar interface for extra options, and for using the Apertium web service
   
     Week 4: Refine the work done on the extension, package it neatly and perform testing on different systems, create documentation.


     Deliverable #1
         o The OpenOffice extension


     Week 5: Get familiar with Pidgin-devel, plan and design the plugin, create a mockup, begin the coding part.
   
     Week 6: Finish coding the plugin
   
     Week 7: Refine the work done on the plugin, package it neatly and perform testing on different systems, create documentation.
   
     Week 8: Get familiar with building Firefox and Thunderbird extensions (XUL, XPCOM), create the initial design.


     Deliverable #2
         o The OpenOffice extension, Pidgin plugin


     Week 9: Start coding the Firefox Extension
   
     Week 10: Finish the Firefox extension, code the Thunderbird extension.
   
     Week 11: Refine the work done on the extensions, package them neatly and perform testing on different systems, create documentation.


     Week 12: Review the work done till now, do further testing, resolve any apparent bugs, formalize the documentation.


     Project completed
         o The OpenOffice extension, Pidgin plugin, Firefox & Thunderbird extensions
         o Publish the work, gauge the community feedback, plan for future work.


About Me:

I am currently in the third year of my four-year integrated undergraduate course, MSc (Tech) Information Systems, at BITS Pilani, Goa Campus (India). I am among the top 5% of my class based on the CGPA after the last semester. I am comfortable with coding in C++ and Java, and am decently acquainted with bash scripting. Apart from these, I am open to learning anything new that might be necessary during the course of my project. My university will close for the summer break from 15th May 2009 to 5th August 2009, during which I have no pending academic work. I will be doing some extra-curricular work under one of my professors, which will not take a lot of time, and I'll be glad to commit more than 30 hours per week, which in my opinion will be required to meet the deadlines I have set. From 6th August 2009 to 3rd September 2009, I will take time out of my academic work to finish the later stages of the project. That work will not be heavy, because it will be my penultimate semester and I will be taking no more than four courses.