User:Shubham1011/proposal

From Apertium

Apertium GSoC 2019

Python API(APy) for Apertium

Contact Information

Name: Shubham Dikshit
E-mail address: iamsds123@gmail.com
Mobile Number: +91 9773880604 (India)
GitHub: shubham10111
Timezone: UTC+5:30

Why is it that you are interested in Apertium?

I come from a country where 720 dialects are spoken by a population of 1.3 billion. Coming from a nation with so many languages, and as a student of Computer Science, I want to apply my knowledge of programming and natural language processing to problems in language translation. I like the concept of Apertium as an open-source language translator because it tackles the problem of language translation in an accessible way. I have a keen interest in programming and development, I am proficient in web development, and I want to apply my knowledge of machine translation by contributing to Apertium and helping to improve open-source language translation. One of Apertium's greatest features is how easily a new language pair can be added. In my opinion this is an extremely important feature of the project, and I also like the idea of general rules for closely related languages.

Which of the published tasks are you interested in? What do you plan to do?

I plan to work on either or both of the following projects:

  1. Python API/library for Apertium
  2. Improvements to the Apertium website

Why should Google and Apertium sponsor it?

I have solid knowledge of and experience with development in Python, C, HTML, CSS and JavaScript. I would also like to work on projects in areas such as web development, OpenCV, machine learning and natural language processing. The Apertium libraries are written in C++, a lower-level language than Python, and I plan to learn C++ as my next language. Because many people prefer high-level languages such as Python, the C++ packages need to be made available to them by writing Python APIs with SWIG. SWIG allows C++ libraries to be used flexibly from scripting languages such as Python.
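
For contrast with SWIG bindings, here is a minimal sketch of the other common way to reach a native tool from Python: spawning it as a subprocess and piping text through it. This is not the proposed API. `run_filter` and `DEMO_COMMAND` are hypothetical names, and the demo command is a stand-in Python one-liner rather than a real Apertium binary. A SWIG binding would replace the process launch with a direct in-process call into the C++ library.

```python
import subprocess
import sys
from typing import List


def run_filter(command: List[str], text: str) -> str:
    """Send text to a command's stdin and return what it writes to stdout.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        command,
        input=text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,   # exchange str instead of bytes
        check=True,  # fail loudly on a non-zero exit code
    )
    return result.stdout


# Stand-in for a native tool (an assumption for this sketch): a child Python
# process that upper-cases whatever it reads from stdin.
DEMO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

if __name__ == "__main__":
    print(run_filter(DEMO_COMMAND, "hello apertium"))
```

With lttoolbox installed, the same helper could in principle be pointed at the `lt-proc` command plus a compiled dictionary file (the path would be an assumption). That needs no wrapper code but pays for a process launch on every call, which is the overhead in-process bindings are meant to avoid.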

How and whom will it benefit in society?

Since most people use Windows or Mac, adding support for these platforms will increase Apertium's user base. A pip install for Windows and Mac would make the development process much easier for developers. Adding capabilities such as dictionary/synonym lookup and web page translation to the website would help attract more users and broaden the uses of Apertium.

Work Plan

Broad Plan

  1. Inspect the HTML, CSS and JavaScript.
  2. Inspect the Python script.
  3. Understand the problem statement and start working on a possible solution, paying attention to each detail.
  4. Work on apertium-apy and lttoolbox and make them available on Windows.
  5. Work on improvements to the Apertium website and add functionality.
  6. Wrap up the project with proper documentation and a project report.

Detailed plan

Week 1
  • Experimenting with Apertium modules for Windows.
  • Developing an understanding of the apertium-python package and its wrapper functions.
  • Trying all the functions of lttoolbox and writing some sample code.
  • Discussing the detailed workflow with mentors.
Deliverable: Report for next week’s work plan.

Week 2
  • Setting up the core modules of Apertium on Windows using the apertium-python package on GitHub.
  • Completing the installation script for Windows in Python.
  • Fixing bugs that prevent Apertium from installing on Windows.
Deliverable: An installable prototype for installing Apertium on Windows.

Week 3
  • Inspecting the Apertium installation and its usage in Jupyter Notebook and other such platforms.
  • Checking that the language processing tools work and are usable in these environments.
Deliverable: Complete installation of Apertium for Windows using pip.

Week 4
  • Working on creating a SWIG API for lttoolbox.
  • Adding other functionality as guided by mentors.
Deliverable: A working transducer function that exposes the C++ functions to Python scripts through SWIG.

Week 5
  • Improving the SWIG API with the transducer function in Python.
  • Fixing the bugs in lttoolbox to make it ready for release.
  • Evaluation by mentor.
Deliverable: Releasing the SWIG API for lttoolbox.

Week 6
  • Improving the Apertium website by adding a dictionary lookup mode for single-word translation that gives synonyms for the translations.
  • Ranking the synonyms in order of their likelihood.
  • Improving some of the existing code.
Deliverable: Dictionary lookup functionality on the website.

Week 7
  • Coloring the resulting translation depending on how reliable it is.
  • Fixing bugs and working on the release.
Deliverable: Reliability visualization ready for release.

Week 8
  • Making language detection work properly.
  • Adding "did you mean" suggestions on the website if someone chooses an unlikely language.
Deliverable: Functional language detection and "did you mean" feature.

Week 9
  • Adding all the functionality to the website.
  • Testing that all features work properly.
  • Debugging the problems.
  • Evaluation by mentor.
Deliverable: Improved Apertium website ready for deployment.

Week 10
  • Debugging apertium-apy by solving new and old issues.
  • Preparing documentation for apertium-apy.
Deliverable: Complete apertium-apy with documentation.

Week 11
  • Debugging lttoolbox.
  • Creating documentation for lttoolbox.
Deliverable: Complete lttoolbox implementation.

Week 12
  • Fixing bugs in the Apertium website.
  • Adding improvements to apertium-html-tools.
Deliverable: Making the website release-ready.

Week 13
  • Releasing the Apertium website with the changes.
  • Testing the released website and debugging the issues.
  • Evaluation by mentor.
Deliverable: Release the Apertium website for users.

Week 14
  • Releasing the final versions of all the work.
  • Completing all the changes suggested by the mentors.
  • Cleaning up the documentation.
  • Reporting unresolved bugs.
Deliverable: Release the final production version.
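
The website features planned for Weeks 6 and 8 could be sketched roughly as follows. This is only an illustration under stated assumptions: `rank_synonyms`, `did_you_mean`, the word-frequency table, the detector scores and the `margin` threshold are all hypothetical and are not part of apertium-html-tools or apertium-apy.

```python
from typing import Dict, List, Optional


def rank_synonyms(candidates: List[str], corpus_counts: Dict[str, int]) -> List[str]:
    """Order candidate translations from most to least frequent.

    corpus_counts is a hypothetical word -> frequency table; words missing
    from it count as 0. Ties are broken alphabetically so that the order
    is deterministic.
    """
    return sorted(candidates, key=lambda word: (-corpus_counts.get(word, 0), word))


def did_you_mean(chosen: str, scores: Dict[str, float], margin: float = 0.2) -> Optional[str]:
    """Suggest another source language when the detector clearly prefers it.

    scores maps language codes to (hypothetical) detector confidences.
    Returns the language code to suggest, or None if the user's choice is
    the top-scoring language or trails it by less than `margin`.
    """
    if not scores:
        return None
    best = max(scores, key=lambda lang: scores[lang])
    if best != chosen and scores[best] - scores.get(chosen, 0.0) >= margin:
        return best
    return None


if __name__ == "__main__":
    counts = {"house": 50, "home": 50, "dwelling": 3}
    print(rank_synonyms(["home", "house", "dwelling"], counts))
    print(did_you_mean("eng", {"spa": 0.9, "eng": 0.05}))
```

Using a margin rather than simply taking the top-scoring language is one way to avoid nagging the user when the detector is nearly undecided between two languages. The right threshold would have to be tuned against real detector output.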

Education

I am pursuing a Bachelor of Technology degree in Computer Science Engineering at IMS Engineering College, India. I am a first-year student enrolled in the four-year CSE course.

Experience

I have been studying Computer Science for a year now, gaining experience in languages such as C, Python, JavaScript, HTML and CSS. I am working towards proficiency in competitive programming and algorithm development. I have been contributing to open source in Python and C for quite a while now. Here is my GitHub timeline:

[Image: GitHub contribution timeline (Git.png)]

Non-summer of code plans

Google Summer of Code falls during my college's summer vacation, when I will also be doing a summer internship at IIIT Hyderabad on Natural Language Processing. I will try to devote 40 hours per week to Apertium during Summer of Code, and more if necessary, and will manage my other work accordingly. I will handle both GSoC and my internship and give both due priority.