Difference between revisions of "User:Jonpan4/gsoc2020proposal"

Revision as of 00:05, 30 March 2020

Contact Information

Name: Jonathan Pan

Location: California

E-mail: jonpan4@gmail.com

IRC: jjjppp

GitHub: JPJPJPOPOP

Why is it that you are interested in Apertium?

I participated in GCI and helped mentor with Apertium. I have also worked a little on UD Annotatrix before, and I want to do a more in-depth project with it.

Which of the published tasks are you interested in? What do you plan to do?

Improvements to UD Annotatrix. The work plan containing the tasks that I plan to do is below.

Reasons why Google and Apertium should sponsor it and how and who it will benefit in society

Universal Dependencies (UD) is a framework for consistent annotation of grammar across many languages through a system of dependency trees. UD Annotatrix is a tool for creating and editing these dependency trees according to the UD guidelines. One part of my project will make the graphs easier to visualise: there are styling issues that have been difficult to address, and switching to a D3.js-based editor will make these style changes easier to implement. I will also implement, or put the finishing touches on, new features that make editing more streamlined.

Skills and Qualifications

I am a freshman majoring in computer science at UC Berkeley. My coding experience includes:

  • HTML
  • CSS
  • JavaScript, D3.js, React
  • Python

Coding challenge (a graph editor built with D3.js): https://github.com/JPJPJPOPOP/d3-graph

Workplan

Week Goals
Week 1 (June 1 - June 7)
  • Building new graph editor (part 1)
  • Adding the necessary visual features like POS labels and token numbers.
  • Adding zooming/panning.
  • Adding the ability to split and merge tokens.
Week 2 (June 8 - June 14)
  • Building new graph editor (part 2)
  • Fixing styling issues such as staggering the vertical alignment of deprels or shifting tokens to account for deprel labels
  • Merging it with the current UD Annotatrix codebase
Week 3 (June 15 - June 21)
  • Fixing the vertical alignment view (it is badly broken in the current version of UD Annotatrix).
Week 4 (June 22 - June 28)
  • Increase support for enhanced dependencies
Week 5 (June 29 - July 5)
  • Active learning for labels
  • First evaluation (June 29 - July 3)
Week 6 (July 6 - July 12)
  • Finishing up GitHub integration
Week 7 (July 13 - July 19)
  • Better support for editing via textbox
Week 8 (July 20 - July 26)
  • Better key support (either making all current key bindings global or creating a way to show which scope the key bindings currently apply to)
  • Undo/redo history fixes
Week 9 (July 27 - August 2)
  • Morphology filling from API (using APy for tokenisation and analysis of sentences)
  • Interface for disambiguation
  • Second evaluation (July 27 - July 31)
Week 10 (August 3 - August 9)
Week 11 (August 10 - August 16)
Week 12 (August 17 - August 24)
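
To illustrate the Week 2 goal of staggering the vertical alignment of deprels, here is a minimal sketch in plain JavaScript of one possible approach: give each dependency arc a level one higher than the tallest arc it spans, so its label can be drawn above the arcs nested inside it. The function name `assignArcLevels` and the `{ head, dep, deprel }` arc shape are assumptions made for this example, not part of the UD Annotatrix codebase, and the sketch does not try to handle crossing (non-projective) arcs.

```javascript
// Hypothetical helper (not from UD Annotatrix): compute a drawing level for
// each dependency arc. Arcs are assumed to look like { head, dep, deprel },
// where head and dep are 1-based token positions in the sentence.
function assignArcLevels(arcs) {
  // Record the left/right token positions each arc covers.
  const spans = arcs.map((arc, index) => ({
    index,
    left: Math.min(arc.head, arc.dep),
    right: Math.max(arc.head, arc.dep),
    level: 1,
  }));

  // Visit shorter arcs first, so every arc nested inside a longer one
  // already has its level when the longer arc is processed.
  const byLength = [...spans].sort(
    (a, b) => (a.right - a.left) - (b.right - b.left)
  );

  const placed = [];
  for (const span of byLength) {
    for (const other of placed) {
      const nested = other.left >= span.left && other.right <= span.right;
      // Draw this arc one level above the tallest arc it contains.
      if (nested) span.level = Math.max(span.level, other.level + 1);
    }
    placed.push(span);
  }

  // Return the arcs in their original order, each with its level attached.
  return arcs.map((arc, i) => ({ ...arc, level: spans[i].level }));
}

// Example sentence: "I saw the big dog" (tokens numbered 1 to 5).
const exampleArcs = [
  { head: 2, dep: 1, deprel: "nsubj" },
  { head: 2, dep: 5, deprel: "obj" },
  { head: 5, dep: 3, deprel: "det" },
  { head: 5, dep: 4, deprel: "amod" },
];
console.log(assignArcLevels(exampleArcs).map((a) => `${a.deprel}:${a.level}`));
// prints [ 'nsubj:1', 'obj:3', 'det:2', 'amod:1' ]
```

In a D3.js-based editor, the level could then be multiplied by a fixed row height to get the vertical offset of each arc and its label.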

Other Commitments

I will probably take an online literature class over the summer to fulfill a college requirement (about 5 hours a week), but I will still have at least 30 free hours a week.