User:AMR-KELEG/GSoC19 Proposal

From Apertium

Personal Information

  • Name: Amr Keleg
  • E-mail address: amr.keleg@eng.asu.edu.eg / amr_mohamed@live.com
  • IRC: AMR-KELEG
  • Location: Cairo, Egypt
  • Timezone: UTC+02
  • Current job: An MSc student and a teaching assistant at the Computer and Systems Department, Faculty of Engineering, Ain Shams University, Cairo, Egypt.

Qualifications

  • I graduated first in my class of 138 students (Computer and Systems Department, Faculty of Engineering, Ain Shams University).
  • I have successfully participated as a student in GSoC 2016 as part of the GNU Octave organisation.
  • I have worked for one year as a full-time machine learning engineer. My role was developing a sentiment analysis model for the Arabic language.
  • As a student, I participated in online (Google Code Jam) and on-site (ACM Collegiate Programming Contest) competitive programming contests. Through those contests, I solved more than 700 problems on different online judges.
  • I am interested in open source communities and have contributed to several open source projects (CLTK, gensim, asciinema, GNU Octave and Apertium).
  • I have completed Udacity's data analysis nanodegree. Throughout its courses, I used Python to analyse different data sets.

Skills

  • Experience in coding with C++ and Python.
  • Good command of git and the GitHub process of contribution.
  • Usage of Ubuntu as the main OS for more than 3 years.
  • Basic knowledge of shell scripting.

Coding challenge

Code repository: https://github.com/AMR-KELEG/apertium-unsupervised-weighting-of-automata

Project Information

Why is it that you are interested in Apertium?

Which of the published tasks are you interested in? What do you plan to do?

Include a proposal, including

   * a title,
   * reasons why Google and Apertium should sponsor it,
   * a description of how and who it will benefit in society,
   * and a detailed work plan (including, if possible, a schedule with milestones and deliverables).

Work Plan

Community Bonding

Communicate with the maintainers and get to know Apertium better.

Solve some issues on GitHub.

Week 1

(27 May - 3 June)

Implement a baseline model for weighting automata.
Week 2

(4 June - 10 June)

Develop the first supervised model (Unigram counts).

Write a shell script for generating weights using a tagged corpus.
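As a rough illustration of the unigram-count idea, the sketch below turns counts of analyses in a tagged corpus into weights. It is only a minimal sketch under stated assumptions: the function name `unigram_weights`, the add-one smoothing, the use of negative log probabilities as weights (a common convention for weighted finite-state tools) and the toy corpus are all hypothetical choices made for this example, not the final design of the project.

```python
import math
from collections import Counter


def unigram_weights(tagged_analyses, smoothing=1.0):
    """Return (weights, unseen_weight) for a list of analysis strings.

    Hypothetical helper: each weight is the negative log of the smoothed
    relative frequency of the analysis, so frequent analyses get lower
    weights. `smoothing` must be positive so that unseen analyses still
    receive a finite weight.
    """
    counts = Counter(tagged_analyses)
    # One extra smoothed slot is reserved for all unseen analyses.
    total = sum(counts.values()) + smoothing * (len(counts) + 1)
    weights = {
        analysis: -math.log((count + smoothing) / total)
        for analysis, count in counts.items()
    }
    unseen_weight = -math.log(smoothing / total)
    return weights, unseen_weight


# Toy corpus (assumed data): one disambiguated analysis per token.
corpus = ["cat<n><sg>", "cat<n><sg>", "cat<n><sg>", "cat<vblex><inf>"]
weights, unseen = unigram_weights(corpus)

# Print analyses from the lowest (most likely) to the highest weight.
for analysis, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{analysis}\t{weight:.4f}")
print(f"<unseen>\t{unseen:.4f}")
```

With weights of this kind, the analysis seen most often in the corpus gets the lowest weight, so a lowest-weight-first ranking of ambiguous analyses puts it first.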

Week 3

(11 June - 17 June)

Read and understand the publication for the first unsupervised model, and plan its implementation.
Week 4

(18 June - 24 June)

Finalise the first unsupervised model and compare it to the supervised one.
Evaluation 1

Deliverables: Two shell scripts for generating weights using both supervised and unsupervised techniques.

Week 5

(29 June - 5 July)

Read and understand the publication for the second unsupervised model, and plan its implementation.
Week 6

(6 July - 12 July)

Implement the second unsupervised model.
Week 7

(13 July - 22 July)

Read and understand the publication for the second unsupervised model, and plan its implementation.
Week 8

(23 July - 26 July)

Implement the second unsupervised model.
Evaluation 2

Deliverables: A shell script for using the second unsupervised model and a plan for implementing the third one.

Week 9

(27 July - 2 August)

Implement the third unsupervised model.
Week 10

(3 August - 9 August)

Solve issues related to the developed models.
Week 11-12

(10 August - 26 August)

Write the required documentation and merge the code into Apertium's repositories.
Final evaluation