Difference between revisions of "User:Ote/proposal"
(overview) |
|||
(3 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
Open Translation Engine (OTE) Version 2.0 |
Open Translation Engine (OTE) Version 2.0 |
||
+ | The goal of the OTE project is to solve two problems found today within the open source translation community: |
||
− | Proposal for NLnet[http://www.nlnet.nl/news/2007/20071003-call-en.html] |
||
+ | == Crowdsourcing == |
||
− | = Abstract = |
||
− | + | How can a large and divergent group of users easily build and manage translation dictionaries online? |
|
+ | The OTE will provide a suite of web-based tools to bring people together in the various tasks needed to build a workable translation dictionary. |
||
+ | == Dictionary Unification == |
||
− | While there are many Open Source projects in the field of Machine Translation, there is a lack of tools for the community creation of translation dictionaries. The goal of the Open Translation Engine (OTE)[http://ote.2meta.com] Version 2.0 is to create a robust tool for this space. |
||
+ | How can we unify the numerous open content translation dictionaries available online? The OTE will support the import and export of many formats currently used by active open content/open source translation projects. Special attention will be paid to Apertium and Linguaphile (machine translation projects) and Wiktionary and Omegawiki (wiki-based dictionaries). |
||
− | |||
− | = Use Cases = |
||
− | |||
− | How will the OTE be used? |
||
− | |||
− | == By the Linguist == |
||
− | |||
− | In the field of Machine Translation, I have chosen the Apertium project as the first external format to support. |
||
− | |||
− | Apertium is a robust project with many active participants, but it lacks tools that allow multiple developers to create the translation dictionaries needed to run Apertium. |
||
− | |||
− | Currently Apertium dictionaries are modified by direct editing of multiple XML files. While these XML files are versioned with Subversion, this is not an ideal solution for true community involvement. |
||
− | |||
− | The OTE will be enhanced to first support tagging of Parts of Speech (Noun, Verb, Adjective, etc). This is an important first step in order to support the second step: Dictionary Export in the proper Apertium XML formats. |
||
− | |||
− | At the end of this project, I will produce a basic workable system that integrates with Apertium. A proposal for OTE Version 3.0 will then be submitted for further enhancements (Dictionary Import, support for Transfer Rules and Apertium Paradigms). |
||
− | |||
− | == By the Student == |
||
− | |||
− | I started the OTE project to help me learn the Dutch language. .... |
||
− | |||
− | == By the Developer == |
||
− | |||
− | A primary goal of this project is to create tools that are easily usable with other Open Source projects. |
||
− | |||
− | = License = |
||
− | |||
− | Currently the OTE is under the BSD License. This includes both the source code and the translation dictionaries. |
||
− | |||
− | During this project, I will re-evaluate which Open Source license is the best choice, with particular attention to possible differences in needs between the source code and translation dictionaries. |
||
− | |||
− | |||
− | = Project = |
||
− | |||
− | == Core System == |
||
− | |||
− | Conversion to Unicode - The current prototype is not Unicode-aware. Unicode support is an absolute necessity if the project is to cover all possible world languages. |
||
− | |||
− | Genericization of code base - The OTE will be built to be as generic as possible, thus allowing for ease of future enhancements. Currently the prototype code is hard coded with the Dutch/English language pair. |
||
− | |||
− | Install Procedure - Installation will be made as user-friendly as possible. |
||
− | |||
− | Documentation |
||
− | |||
− | == User System == |
||
− | |||
− | User accounts |
||
− | |||
− | User Permissions |
||
− | |||
− | User administration |
||
− | |||
− | == Core Tools == |
||
− | |||
− | === Word Viewer === |
||
− | |||
− | === Dictionary Viewer === |
||
− | |||
− | === Dictionary Export === |
||
− | |||
− | Formats: CSV, database (MySQL) dump, OTE XML, Apertium XML |
||
− | |||
− | === Dictionary Import === |
||
− | |||
− | Formats: CSV, OTE XML |
||
− | |||
− | === Word-2-Word Translator === |
||
− | |||
− | === Classroom Tools === |
||
− | Random Word, Flash Cards, Word Lists |
||
− | |||
− | == Community Tools == |
||
− | |||
− | Add / Delete / Modify: Individual Word |
||
− | |||
− | Add / Delete / Modify: Tagging: Parts of Speech for a Word |
||
− | |||
− | Add / Delete / Modify: Translation Word Pairs |
||
− | |||
− | Add / Delete / Modify: Languages |
||
− | |||
− | Versioning - all words, translation pairs, and tags are versioned, allowing reversion to previous states. |
||
− | |||
− | |||
− | = Future Work = |
||
− | |||
− | Upon successful completion of this project, I will submit a new proposal for OTE Version 3.0. |
||
− | |||
− | Possible items: |
||
− | |||
− | Further Apertium support |
||
− | |||
− | More Robust Classroom management tools |
||
− | |||
− | Support for further public installations of OTE |
||
− | |||
− | Integration with more projects... More export formats |
||
− | |||
− | = Misc.... = |
||
− | |||
− | Comparison of OTE to current 'translation memory' systems, such as the many gettext/.po file editors and aggregators. |
Latest revision as of 19:40, 29 November 2007
Open Translation Engine (OTE) Version 2.0
The goal of the OTE project is to solve two problems found today within the open source translation community:
Crowdsourcing
How can a large and divergent group of users easily build and manage translation dictionaries online? The OTE will provide a suite of web-based tools to bring people together in the various tasks needed to build a workable translation dictionary.
Dictionary Unification
How can we unify the numerous open content translation dictionaries available online? The OTE will support the import and export of many formats currently used by active open content/open source translation projects. Special attention will be paid to Apertium and Linguaphile (machine translation projects) and Wiktionary and Omegawiki (wiki-based dictionaries).