Easy dictionary maintenance

Easy Dictionary Maintenance - Student Reports

This space will report developments in the project. It is also a space to post comments and suggestions.

Original Ideas

http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code

http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Easy_dictionary_maintenance

Original AlessioJr GSOC2010 Application

http://wiki.apertium.org/wiki/User:Alessiojr/Easy_dictionary_-_Application-GSOC2010

Title: Easy dictionary maintenance

Student: Alessio Miranda Junior
E-mail: alessio@inf.ufpr.br or alessio@alessiojr.com
Msn: msn@juninho.com.br
IRC: AlessioJr
GTalk: alessiojunin@gmail.com

Abstract:

   The idea is to develop a GUI tool to manage Apertium monolingual and bilingual XML files, with the following objectives:
   • Provide an alternative, GUI-based way to edit dix files.
   • Support monolingual dictionaries first, while preserving the particular format of each file.
   • Minimize direct manipulation of XML files by providing features that reduce the need for it.
   • Reuse code from DixTools.

Why?

   The number of language pairs in development for Apertium is increasing, and so is the complexity of these pairs.
   This growing complexity makes the work harder, so the need for dedicated tools is evident.
   The proposed tool aims to make dictionary management easier, which should make the development of new
   language pairs more likely. With better tools, more people will be able to develop language pairs.

Who can use it?

   I believe the whole Apertium community will benefit, directly or indirectly. Directly, developers
   of language pairs will find their task easier. With a good tool to help with the work, creating
   or maintaining a language will become easier and will probably take less time to produce better results.
   Indirectly, users will benefit from the better and more robust results.

What is the plan?

   We plan to create a GUI with features that facilitate the common tasks of a user who wishes
   to manipulate an existing language pair or dictionary. These features will also be of great value to new
   users, who will have an intuitive tool for starting new language pairs.
   DixTools, a tool developed for Apertium, already solves half the problem: it loads an XML file
   into memory and, in the other direction, writes the XML back out in a suitable format.
   We believe the main challenge of this task is to find a way to extend DixTools, adapting its
   existing classes into a persistence layer connected to a GUI application framework, so that
   dictionary elements can be searched, filtered, integrated and changed.
   The application will be developed for manipulating monolingual dictionaries, but its architecture must
   support future extensions (web-based and collaborative editing) and bilingual dictionaries.
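
   The load, search and save round trip described above can be sketched with the XML APIs that ship with the JDK. This is only an illustration under stated assumptions: it does not use the DixTools classes, the class name DixSketch and its method names are hypothetical, and the sample dictionary is a made-up fragment in the usual dix layout (entries as <e lm="..."> elements inside a <section>).

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of DixTools: load, search and save a dictionary.
public class DixSketch {

    /** Parses dictionary XML text into an in-memory DOM tree. */
    public static Document load(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse dictionary XML", e);
        }
    }

    /** Returns the lemma of every <e> entry whose "lm" attribute starts with the prefix. */
    public static List<String> findLemmas(Document doc, String prefix) {
        List<String> hits = new ArrayList<>();
        NodeList entries = doc.getElementsByTagName("e");
        for (int i = 0; i < entries.getLength(); i++) {
            // Entries inside paradigm definitions carry no lemma and are skipped.
            String lemma = ((Element) entries.item(i)).getAttribute("lm");
            if (!lemma.isEmpty() && lemma.startsWith(prefix)) {
                hits.add(lemma);
            }
        }
        return hits;
    }

    /** Serializes the DOM tree back to XML text. */
    public static String save(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize dictionary", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<dictionary><section id=\"main\" type=\"standard\">"
                + "<e lm=\"house\"><i>house</i><par n=\"house__n\"/></e>"
                + "<e lm=\"horse\"><i>horse</i><par n=\"house__n\"/></e>"
                + "<e lm=\"cat\"><i>cat</i><par n=\"house__n\"/></e>"
                + "</section></dictionary>";
        Document doc = load(xml);
        System.out.println(findLemmas(doc, "ho")); // prints [house, horse]
        System.out.println(save(doc));
    }
}
```

   A plain DOM tree keeps the example short, but it holds the whole file in memory, which is one reason the timeline reserves time for comparing alternatives on huge dictionaries.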

What will we try to use?

Development paradigm: MVC
Programming language: Java/Swing
Persistence: XML (Apertium XML files)
Frameworks: DixTools, JPA, Swing Application Framework
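
   As a rough idea of how MVC could look with Swing, the sketch below puts dictionary entries behind a table model that a JTable (the view) can display. The class EntryTableModel, its Entry type and the two columns are hypothetical choices made for illustration; in the planned tool the entries would come from the persistence layer rather than from add() calls.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.table.AbstractTableModel;

// Hypothetical model class: exposes dictionary entries to a Swing JTable.
public class EntryTableModel extends AbstractTableModel {

    /** One dictionary entry as the view needs it: a lemma and its paradigm name. */
    static final class Entry {
        final String lemma;
        final String paradigm;

        Entry(String lemma, String paradigm) {
            this.lemma = lemma;
            this.paradigm = paradigm;
        }
    }

    private static final String[] COLUMNS = {"Lemma", "Paradigm"};
    private final List<Entry> entries = new ArrayList<>();

    /** Appends an entry and notifies any attached view that a row was inserted. */
    public void add(String lemma, String paradigm) {
        entries.add(new Entry(lemma, paradigm));
        int row = entries.size() - 1;
        fireTableRowsInserted(row, row);
    }

    @Override
    public int getRowCount() {
        return entries.size();
    }

    @Override
    public int getColumnCount() {
        return COLUMNS.length;
    }

    @Override
    public String getColumnName(int column) {
        return COLUMNS[column];
    }

    @Override
    public Object getValueAt(int row, int column) {
        Entry entry = entries.get(row);
        return column == 0 ? entry.lemma : entry.paradigm;
    }

    public static void main(String[] args) {
        EntryTableModel model = new EntryTableModel();
        model.add("house", "house__n");
        model.add("cat", "house__n");
        System.out.println(model.getRowCount()); // prints 2
    }
}
```

   A view would be attached with new JTable(model). Because the model knows nothing about the widgets, the same class could later back a different front end.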

Timeline/Stages/Milestones?

Week | Stage | Description
1, 2 | Analysis of technology for in-memory handling | Investigate and select an effective way to view and manipulate Apertium XML files in memory using Java. Analyze which technologies best complement the functionality of DixTools when manipulating XML: possibly database integration, VTD-XML, or extending the DixTools classes. Test and choose the best alternative.
2, 3 | Development of first prototype | Develop an interface built around a core set of features such as load, save, list, search and filter elements.
- | Prototype Milestone 1 | First version for community analysis. Provides the basic architecture, the interface model, and basic handling.
5, 6 | Simple Structures | Implementation of symbols, alphabet and statistics features. Requires sketching experiments to design the user interface.
7 | Paradigms | First implementation of features for paradigms. Requires sketching experiments to design the user interface.
8 | Lemmas | First implementation of features for lemmas. Requires sketching experiments to design the user interface.
- | Prototype Milestone 2 | Version for testing with huge dictionaries and for a complete editing test of the basic features.
9 | Paradigms | Using community feedback, adjust the interface and implementation, and probably add new features.
10 | Lemmas | Using community feedback, adjust the interface and implementation, and probably add new features.
11 | Pre-Release | Buffer time to improve integration functionality.
- | Prototype Release Candidate | -
12 | Makeup | Fix remaining bugs, make final adjustments and write documentation in the wiki.
- | Final Release | -
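
The statistics feature planned for weeks 5 and 6 could start from simple counts over a monolingual dictionary. The sketch below is an assumption about what such statistics might be, not a specification: the class DixStats and the chosen counters are hypothetical, and it relies only on the JDK's DOM parser and on the usual dix element names (alphabet, sdef, pardef, e).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: basic counts over a monolingual dictionary.
public class DixStats {

    /** Counts alphabet letters, symbol definitions, paradigms and lemma entries. */
    public static Map<String, Integer> count(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse dictionary XML", e);
        }

        Map<String, Integer> stats = new LinkedHashMap<>();

        // Alphabet size, counted in Unicode code points of the <alphabet> text.
        NodeList alphabet = doc.getElementsByTagName("alphabet");
        String letters = alphabet.getLength() == 0
                ? "" : alphabet.item(0).getTextContent().trim();
        stats.put("alphabetLetters", letters.codePointCount(0, letters.length()));

        stats.put("symbols", doc.getElementsByTagName("sdef").getLength());
        stats.put("paradigms", doc.getElementsByTagName("pardef").getLength());

        // Only <e> elements with an "lm" attribute are counted as lemma entries;
        // the <e> elements inside paradigm definitions have none.
        int lemmas = 0;
        NodeList entries = doc.getElementsByTagName("e");
        for (int i = 0; i < entries.getLength(); i++) {
            if (((Element) entries.item(i)).hasAttribute("lm")) {
                lemmas++;
            }
        }
        stats.put("lemmas", lemmas);
        return stats;
    }

    public static void main(String[] args) {
        String xml = "<dictionary><alphabet>abc</alphabet>"
                + "<sdefs><sdef n=\"n\"/><sdef n=\"sg\"/></sdefs>"
                + "<pardefs><pardef n=\"house__n\">"
                + "<e><p><l/><r><s n=\"n\"/><s n=\"sg\"/></r></p></e>"
                + "</pardef></pardefs>"
                + "<section id=\"main\" type=\"standard\">"
                + "<e lm=\"house\"><i>house</i><par n=\"house__n\"/></e>"
                + "</section></dictionary>";
        System.out.println(count(xml));
        // prints {alphabetLetters=3, symbols=2, paradigms=1, lemmas=1}
    }
}
```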