Difference between revisions of "User:Darthxaher/Application2010"

From Apertium

Revision as of 04:51, 30 March 2010

Google Summer of Code Application 2010
Abu Zaher Md. Faridee
Department of Computer Science and Engineering
Bangladesh University of Engineering and Technology


1 Name

Abu Zaher Md. Faridee

2 Email Address

zaher14@gmail.com

3 Contact Information

IRC: darthxaher@irc.freenode.net

Cell Phone: +880 1714070147

4 Why is it you are interested in machine translation?

As a student of Computer Science, I'm personally very interested in the fields of Artificial Intelligence, Machine Learning and Pattern Recognition. I think machine translation is one of the most exciting applications in these fields. The most interesting thing about machine translation is how fundamentally different the various MT techniques are. Whereas rule-based machine translation relies extensively on automata theory and pattern matching, the statistical machine translation approach harnesses statistics and information theory. There has been extensive work in this field in the recent decade, and there is much still to be done.

Working on machine translation also involves the unique bonus of getting to know a lot of different languages and cultures, which is its own reward.

5 Why is it you are interested in the Apertium Project?

I successfully completed my last Google Summer of Code project (2009), titled 'Conversion of Anubadok: Creating an English Bengali Language Pair', under Apertium. The project was a great experience for me: I had the opportunity to work with some of the experts in rule-based machine translation. Though I was quite interested in working in this field, my knowledge of machine translation was limited at the start. During the course of the project I got the chance to understand the intricacies of RBMT through my mentor and Apertium's helpful community. It goes without saying that Apertium's community is one of the most active open source communities out there, and I really feel at home here.

I have been a long-time supporter of the open source movement in my country. Adopting the open source philosophy is crucial for a developing country like Bangladesh, where the cost of proprietary software is unaffordable for most people. The open source machine translation offered by Apertium will have a far-reaching effect on Bengali language adoption locally and on the localization of open source software.

6 Which of the published tasks are you interested in? What do you plan to do?

I'm interested in the 'VM for the transfer module' idea, that is, creating a virtual machine for the transfer stage in Apertium's pipeline.

As already mentioned on the idea's page, Apertium currently uses XML tree walking in the transfer stage, the stage in which Apertium applies structural changes to sentences. This is quite inefficient, as XML parsing is quite time consuming. The idea is to define a set of pseudo-assembly-level mini instructions that embody the rules stated in the XML files (t1x, t2x, t3x), then compile them to an easy-to-use byte-code format. A tiny, highly optimized virtual machine would need to be written to run the byte-code. Even a VM without JIT optimization could achieve a performance gain of several orders of magnitude over the existing XML-based solution.
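
To make the idea concrete, here is a minimal sketch of what such a byte-code interpreter might look like. Everything in it is an assumption made for illustration: the opcode names (PUSH, LEMMA, TAGS, CONCAT, OUT, HALT), the tuple-based instruction encoding, and the simplified word representation are hypothetical and are not Apertium's actual transfer format or the final design of this project. The example program stands in for a compiled rule that swaps an adjective and a noun.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical opcodes, invented for this sketch only.
PUSH, LEMMA, TAGS, CONCAT, OUT, HALT = range(6)

Instruction = Tuple[int, Optional[object]]
Word = Dict[str, str]


def run(code: List[Instruction], words: List[Word]) -> str:
    """Execute byte-code against the words matched by a rule pattern."""
    stack: List[str] = []
    output: List[str] = []
    pc = 0  # program counter
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == PUSH:        # push a literal string
            stack.append(str(arg))
        elif op == LEMMA:     # push the lemma of matched word number arg
            stack.append(words[arg]["lemma"])
        elif op == TAGS:      # push the tag string of matched word number arg
            stack.append(words[arg]["tags"])
        elif op == CONCAT:    # pop arg items and push them joined, in order
            items = stack[-arg:]
            del stack[-arg:]
            stack.append("".join(items))
        elif op == OUT:       # pop the top of the stack into the output
            output.append(stack.pop())
        elif op == HALT:
            break
        else:
            raise ValueError(f"unknown opcode {op}")
    return "".join(output)


# Assumed result of compiling an "adjective noun -> noun adjective" rule.
program: List[Instruction] = [
    (LEMMA, 1), (TAGS, 1), (CONCAT, 2), (OUT, None),
    (PUSH, " "), (OUT, None),
    (LEMMA, 0), (TAGS, 0), (CONCAT, 2), (OUT, None),
    (HALT, None),
]

words: List[Word] = [
    {"lemma": "red", "tags": "<adj>"},
    {"lemma": "car", "tags": "<n><sg>"},
]

print(run(program, words))  # prints "car<n><sg> red<adj>"
```

The point of the sketch is that the rule is translated once into a flat instruction list, so applying it to each matched pattern is a simple loop over instructions instead of a walk over an XML tree.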

7 Why should Google and Apertium Sponsor it?

The existing architecture of Apertium is very robust and fast, but it should be faster.

8 How and who will it benefit in society?

9 Work Plan