Difference between revisions of "User:Ilnar.salimzyan/Wishlist"

From Apertium

Revision as of 14:13, 8 March 2015

Annotatrix

For Annotatrix: comparing my annotation with the annotation done by another user, calculating the inter-annotator agreement, and easily merging the two versions by highlighting the sentences which differ.
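To make the wish a bit more concrete, here is a minimal Python sketch of the two calculations involved. It assumes Cohen's kappa as the agreement measure (only one of several possible measures, nothing above fixes which one Annotatrix should use) and assumes annotations are available as plain label lists keyed by sentence id. The function names `cohen_kappa` and `differing_sentences` are made up for illustration and are not part of Annotatrix.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Share of positions where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def differing_sentences(annotation_a, annotation_b):
    """Ids of sentences annotated by both users whose annotations differ."""
    shared = annotation_a.keys() & annotation_b.keys()
    return sorted(sid for sid in shared if annotation_a[sid] != annotation_b[sid])
```

The list returned by `differing_sentences` is what a merge view would highlight; sentences annotated by only one user are ignored in this sketch.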

Pan-Turkic-English-Russian dictionary

A Pan-Turkic+English+Russian dictionary which we maintain (a superset of every Turkic-to-Turkic, Turkic-to-English and Turkic-to-Russian bidix, which is also used for testing translators). It would help with the classification of stems, i.e. with being consistent across Turkic pairs. Exporting it into OmegaWiki would be cool as well (although we need much less than what OmegaWiki does).

I would store it in a format as close to the current bidix format as possible, e.g.:

<e>
  <p>
    <tat>китап<sdef n="n"/></tat>
    <kaz>кітап<sdef n="n"/></kaz>
    <eng>book<sdef n="n"/></eng>
    <rus>книга<sdef n="n"/><sdef n="nn"/><sdef n="f"/></rus>
  </p>
</e>
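As a rough illustration of the "superset" idea, the following Python sketch projects one multilingual entry onto an ordinary two-sided entry for a chosen language pair. It assumes the entry is well-formed XML (closed with `</e>`), and that the target is the usual bidix shape with `<l>`/`<r>` sides and `<s n="..."/>` symbols. The function name `project` is hypothetical, and pardefs are not handled.

```python
import xml.etree.ElementTree as ET

# One multilingual entry in the proposed format.
ENTRY = """
<e>
  <p>
    <tat>китап<sdef n="n"/></tat>
    <kaz>кітап<sdef n="n"/></kaz>
    <eng>book<sdef n="n"/></eng>
    <rus>книга<sdef n="n"/><sdef n="nn"/><sdef n="f"/></rus>
  </p>
</e>
"""


def project(entry_xml, left, right):
    """Build a two-sided <e><p><l>..</l><r>..</r></p></e> entry.

    Returns None if the multilingual entry lacks either language.
    """
    source = ET.fromstring(entry_xml)
    out_e = ET.Element("e")
    out_p = ET.SubElement(out_e, "p")
    for side, lang in (("l", left), ("r", right)):
        node = source.find(f"./p/{lang}")
        if node is None:
            return None
        out = ET.SubElement(out_p, side)
        out.text = node.text  # the lemma precedes the symbol tags
        for tag in node.findall("sdef"):
            # Assumption: <sdef n="x"/> here corresponds to <s n="x"/> in a bidix.
            ET.SubElement(out, "s", {"n": tag.get("n")})
    return ET.tostring(out_e, encoding="unicode")


if __name__ == "__main__":
    print(project(ENTRY, "tat", "kaz"))
    print(project(ENTRY, "eng", "rus"))
```

Running the same function over every entry and every language pair would, under these assumptions, regenerate each pairwise bidix from the single shared file.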

There are some little tricks in pardefs (which you can't have when using a spreadsheet or similar) that are useful when translating into or from Russian or English; they would work for any Turkic language.

The main motivation for such a dictionary is to keep everything in one place, so that we can control things like "мүмкін емес". The reason it appeared in kaz.lexc was that eng-kaz.dix contained it. If there is a better way to handle it, we could agree on one and store it in our Pan-Turkic dictionary. You know, one single point of leverage for harnessing the influence of non-Turkic languages :)