Difference between revisions of "User:Aditya"


Revision as of 12:12, 27 March 2018

Contact Information

Name: Aditya
E-mail address: adityaprayaga@gmail.com
Location: Hyderabad, India.
Phone Number: +91-8106805681
GitHub: https://github.com/aditya-369
IRC: aditya_369
Languages: Telugu(native), English, Hindi

Why is it that you are interested in Apertium?

I am deeply interested in web development and NLP, and since Apertium is working to improve its website, I am even more excited to work on this project. As an enthusiastic web developer, I see Apertium as a great opportunity to put that inclination into practice.

Moreover, many existing machine translation systems are commercial or use proprietary technologies, which makes them very hard to adapt to new usages; furthermore, they use different technologies across language pairs, which makes working across them very difficult. Apertium, by contrast, uses a single language-independent specification, which makes contributing easier, makes development more efficient, and helps the project's overall growth.

Which of the published tasks are you interested in? What do you plan to do?

I am excited to work on the project "Improving Apertium website", which is listed as idea 1.8 in the published list of ideas.

Work Flow Details

Apertium already has a pretty cool website, but this project is an opportunity to improve it and make it even cooler than it is now. The work flow is split into three phases, one for each of the three tasks listed in the idea:

  • Phase-1: Dictionary Look-Up Mode for single-word translations
  • Phase-2: Colour Grading according to the reliability of translation
  • Phase-3: Make language detection work properly and Suggestion styling if people choose an unlikely source language

    Phase-1

    Dictionary Look-Up Mode for single-word translations

    For the dictionary look-up task for single-word translations, a back-end module needs to be developed in Python that takes a language pair and a word as input parameters. These parameters are used in the morphological analyser phase, which segments the word and looks it up in the language dictionaries. On the front end, a check box or enable button is needed to turn on the dictionary look-up mode. I also referred to previous ideas in [1].
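The back-end entry point described above could be sketched as follows. This is only an illustration under stated assumptions: `SAMPLE_DICTIONARIES` and `lookup_word` are hypothetical names, and the in-memory dictionary stands in for the real Apertium language data, which this sketch does not touch.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory bilingual dictionary keyed by (source, target) codes.
# A real back end would query the installed Apertium language data instead.
SAMPLE_DICTIONARIES: Dict[Tuple[str, str], Dict[str, List[str]]] = {
    ("eng", "spa"): {
        "house": ["casa", "hogar"],
        "dog": ["perro"],
    },
}


def lookup_word(source: str, target: str, word: str) -> List[str]:
    """Return the known translations of a single word, or an empty list."""
    # Dictionary mode is meant for single words only, so reject anything else.
    if len(word.split()) != 1:
        raise ValueError("dictionary look-up mode expects exactly one word")
    entries = SAMPLE_DICTIONARIES.get((source, target), {})
    # Normalise case and surrounding whitespace before the look-up.
    return list(entries.get(word.strip().lower(), []))
```

For example, `lookup_word("eng", "spa", "House")` returns the two sample entries for "house", while an unknown word or an unknown language pair yields an empty list.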

  • Dictionary Look-Up Mode should give synonyms and alternative translations

    A check box or enable button turns on the dictionary look-up mode, which adds a division under the target-language division to display the synonyms and alternative translations. A back-end module for this is also written in Python so that the look-up of words with similar meaning is fast and accurate. To accomplish this we need to maintain links between similar words; using lists in Python is another alternative.
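One way to keep the links between similar words mentioned above is a small index that maps each word to the set of words it is linked to. The sketch below is an assumption about how this could look, not existing Apertium code; `SynonymIndex` is a hypothetical name.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Set


class SynonymIndex:
    """Stores symmetric links between words that share a meaning."""

    def __init__(self) -> None:
        self._links: DefaultDict[str, Set[str]] = defaultdict(set)

    def add_group(self, words: Iterable[str]) -> None:
        """Link every word in the group to every other word in it."""
        group = {w.lower() for w in words}
        for word in group:
            self._links[word] |= group - {word}

    def synonyms(self, word: str) -> List[str]:
        """Return the linked words in alphabetical order."""
        return sorted(self._links.get(word.lower(), set()))
```

Because each word points directly at its set of synonyms, a look-up is a single dictionary access, which is faster than scanning plain lists.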

  • Dictionary Look-Up Mode should rank the translations by likelihood

    As above

    • Front-End:
    1. A check box or enable button under the "Instant translation" check box; when checked or enabled, it lets the user see synonyms for the translated word. These synonyms would be shown in a div below the target-language look-up area.

    2. Add probability bars to each alternative translation. These bars (visualisations) are generated using the matplotlib library in Python, and the probabilities show the likelihood of each translation.

    • Back-End:
    1. Conditions determine whether dictionary mode is enabled or disabled (1/0), with separate code for each value, so that the module returns the synonyms and alternative translations.

    2. As the alternative translations above appear, they are ranked by a ranking method and visualised. Any suitable ranking method can be used to find the probability; the link in [2] lists ranking methods.
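As a minimal sketch of ranking by likelihood, assume each alternative translation comes with a count of how often it was observed (for example in a corpus). The counts, and the name `rank_by_likelihood`, are assumptions made for illustration; the proposal leaves the actual ranking method open.

```python
from typing import Dict, List, Tuple


def rank_by_likelihood(counts: Dict[str, int]) -> List[Tuple[str, float]]:
    """Turn raw counts into probabilities and sort the most likely first."""
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by descending count; break ties alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(word, count / total) for word, count in ranked]
```

The probabilities returned here are the values that the probability bars on the front end would display.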


    Phase-2

    Colour Grading according to the reliability of translation

  • Colour Grading according to the reliability:

    Colour grading according to the reliability of a translation is not as simple as it looks. It needs a proper reliability function, for example WER (Word Error Rate), which helps in calculating a probability; this value then drives the colour grading in the front end.

    • Front-End:
    1. I would like to display the more reliable translations in green, decreasing the brightness as reliability drops.

    2. Update the existing Bootstrap of the Apertium website to version 4, and solve some more issues in apertium-html-tools, such as the "apertium cannot translate on git" issue.
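The green grading in item 1 could be sketched as a mapping from a reliability score to a colour code. The assumptions here are that the score lies between 0 and 1 with 1 being most reliable, and that the darkest shade is a tunable parameter; `reliability_to_green` is a hypothetical name.

```python
def reliability_to_green(score: float, darkest: int = 64) -> str:
    """Map a reliability score in [0, 1] to a hex green, brighter when higher."""
    # Clamp out-of-range scores so the result is always a valid colour.
    clamped = max(0.0, min(1.0, score))
    green = round(darkest + clamped * (255 - darkest))
    return f"#00{green:02x}00"
```

A fully reliable translation gets the brightest green (`#00ff00`), and the shade darkens linearly towards `#004000` as the score approaches 0.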

    • Back-End:
    1. WER Explanation:

    Definitions:

  • S is the number of substitutions,
  • D is the number of deletions,
  • I is the number of insertions,
  • C is the number of the corrects,
  • N is the number of words in the reference (N=S+D+C)
    These counts are used to build the reliability function; the resulting number is passed to the front end, which labels the translation by colour.
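With the definitions above, the standard word error rate is WER = (S + D + I) / N. A minimal sketch of computing it is given below, assuming a reference translation is available and that words are separated by whitespace; it uses the usual word-level edit distance, and `word_error_rate` is a hypothetical name.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (S + D + I) / N using word-level edit distance."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

A WER of 0 means the hypothesis matches the reference exactly. Because insertions are counted, the value can exceed 1, so a reliability score derived from it (for example 1 minus WER) would need to be clamped before it is used for colouring.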

    Note: As this is a 4-week phase, I would also like to update the existing Bootstrap to Bootstrap v4 if time permits.

    Phase-3

    Make language detection work properly and Suggestion styling if people choose an unlikely source language

  • Styling(did you mean) when people choose an unlikely source language:

    When users knowingly or unknowingly choose an unlikely source language, this module would show an alert box or message stating the expected source language.

  • Make language detection work properly:

    Language detection is an existing feature in Apertium, but with the "did you mean" feature above, the user would know which language is correct.

    • Front-End:
    1. If the user chooses an unlikely source language, a red alert or dialog box appears with a "did you mean this" source-language suggestion.
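The decision of when to show the "did you mean" alert could be sketched as below. It assumes the language detector yields a score per language code; the shape of that data, the `margin` threshold and the name `suggest_source_language` are all assumptions for illustration rather than the actual apertium-apy output format.

```python
from typing import Dict, Optional


def suggest_source_language(
    chosen: str, scores: Dict[str, float], margin: float = 0.2
) -> Optional[str]:
    """Return a better source language to suggest, or None if the choice is fine."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    # Only suggest when the best candidate clearly beats the user's choice,
    # so that near-ties do not trigger a distracting alert.
    if best != chosen and scores[best] - scores.get(chosen, 0.0) >= margin:
        return best
    return None
```

The front end would show the alert only when this function returns a language code, and stay silent when it returns `None`.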

    • Back-End: two modules are to be developed here
    1. A module that integrates with the front-end "did you mean" styling, developed by checking the alphabet of the input against each language in the pair sequentially.

    2. A Python module that makes language detection work properly by increasing detection accuracy through optimised search patterns. This module takes the word given by the user and verifies its syntax or format across languages using the /identifyLang function that already exists in apertium-apy.
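The alphabet check in the first module could be sketched with Python's standard `unicodedata` module, which exposes the Unicode name of each character (for example "TELUGU LETTER TA"). The `EXPECTED_SCRIPT` table and both function names are hypothetical; a real module would need a complete mapping for the languages Apertium supports.

```python
import unicodedata
from collections import Counter
from typing import Dict, Optional

# Hypothetical mapping from language code to the script it is normally written in.
EXPECTED_SCRIPT: Dict[str, str] = {
    "eng": "LATIN",
    "hin": "DEVANAGARI",
    "tel": "TELUGU",
}


def dominant_script(text: str) -> Optional[str]:
    """Return the most common script among the letters of the text."""
    scripts: Counter = Counter()
    for char in text:
        if char.isalpha():
            name = unicodedata.name(char, "")
            if name:
                # The first word of a Unicode character name is its script.
                scripts[name.split()[0]] += 1
    return scripts.most_common(1)[0][0] if scripts else None


def script_matches(language: str, text: str) -> bool:
    """Check whether the text is written in the script expected for the language."""
    expected = EXPECTED_SCRIPT.get(language)
    found = dominant_script(text)
    # When either side is unknown, do not raise a false alarm.
    return expected is None or found is None or expected == found
```

This check only separates languages that use different scripts; languages sharing a script, such as English and Spanish, still need the existing language detection.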

    Work plan

    Phase-1, Weeks 1-4

    • Week 1: Front-End module for Dictionary Look-Up Mode with synonyms and alternative translations.
    • Week 2: Back-End module for Dictionary Look-Up Mode with synonyms and alternative translations.
    • Week 3: Back-End module for Dictionary Look-Up Mode ranks the translations by likelihood.
    • Week 4: Front-End module for Dictionary Look-Up Mode ranks the translations by likelihood.
    • Deliverable #1: June 11-13

    Phase-2, Weeks 5-8

    • Week 5: Back-End module for Colour Grading according to the reliability function (Word Error Rate, WER).
    • Week 6: Back-End module for Colour Grading according to the reliability function (Word Error Rate, WER).
    • Week 7: Front-End module for Colour Grading according to the reliability.
    • Week 8: Update Bootstrap to Bootstrap v4 and resolve issues in apertium-apy and apertium-html-tools.
    • Deliverable #2: July 9-11

    Phase-3, Weeks 9-12

    • Week 9: Back-End module for styling ("did you mean") if people choose an unlikely source language.
    • Week 10: Front-End module for styling ("did you mean") if people choose an unlikely source language.
    • Week 11: Study various methods for increasing accuracy, choose the most suitable one, and start building the model.
    • Week 12: Back-End module for making language detection work properly by increasing the accuracy of detection.
    • Final Delivery: August 6 - 14
    • Project completed

      Finally, the Apertium website will:
      + have a "dictionary lookup" mode for single-word translations
      + give synonyms and alternative translations
      + rank the translations by likelihood
      + have a mode that colours the resulting translation depending on how reliable it is
      + make language detection work properly
      + show a "did you mean" style suggestion if people choose an unlikely source language.


    Weeks 5 and 11 above would not be spent entirely on coding; time is set aside in them to learn the material required to optimise the performance of the website features.

    Education and Skills

    University Courses

  • Design and Analysis of Algorithms
  • Object Oriented Analysis and Design
  • Formal Languages and Automata Theory
  • Compiler Design
  • Data Structures
  • Data Base Management Systems
  • Math (Advanced Calculus, Ordinary Differential Equations and Laplace Transforms, Computational Methods, Probability, Statistics and Queuing Theory)
  • Web Technologies
  • Operating Systems
  • Linux Internals

    Special Courses

    Technical Skills

    Programming languages: C, C++, Python, JavaScript, R, Java
    Web Technologies: HTML, CSS, XML, Angular, Bootstrap
    App Development: Android Studio
    Frameworks: Django, Express.js
    Databases: Oracle (MySQL), MongoDB, SQLite

    Reasons why Google and Apertium should sponsor Improving of Apertium Website Project

    Apertium has a cool idea for machine translation of languages, but the website is the only gateway through which that idea reaches users. A good user experience would therefore enable a large online audience to use the site and help them learn more about the languages they need.

    From a societal perspective, Apertium's core idea is itself a benefit to society: when there is no language barrier to communication, everyone can express their thoughts in their regional language, and Apertium narrows that barrier.

    As for this project, improving the Apertium website, features such as synonyms, alternative translations and colour grading according to reliability would improve the user experience and in turn help the organisation grow.

    Description of how and who it will benefit in society

    From a societal perspective, Apertium's core idea is itself a benefit to society: when there is no language barrier to communication, everyone can express their thoughts in their regional language, and Apertium narrows that barrier.

    Non-Summer-of-Code plans you have for the Summer

  • I have exams scheduled during April 19-29. As per the GSoC schedule, results would be out on April 23, so I have 2 exams within that period. Actual coding starts on May 14, which leaves plenty of preparation time before GSoC coding begins.

  • So, except on exam days during the community bonding period, I would work about 35-40 hours per week.

    Coding Challenges

    1. Installed both apertium-html-tools and apertium-apy from their respective git repositories, i.e. https://github.com/apertium/apertium-html-tools and https://github.com/apertium/apertium-apy.
    2. Tried to resolve issue #114 in apertium-html-tools (https://github.com/apertium/apertium-html-tools/issues) and sent pull request #304.
    3. The issue above is about getting rid of inline styles in the index.html.in file in apertium-html-tools. The initial approach was to remove the styles from index.html.in, give the elements ids, and refer to these ids in an external style sheet. While solving this issue I interacted frequently with my mentor Sushain and other organisation members such as Jonathan, and concluded that Apertium is slowly migrating to Bootstrap v4, so I found some classes in Bootstrap v4 and placed these in general.css, with the remaining styles in translation.css.

    4. Started to work on back-end-related issues in apertium-apy (#61).