User:Ilnar.salimzyan/Wishlist

From Apertium

Annotatrix

For Annotatrix: comparing my annotation with the annotation done by another user, calculating the inter-annotator agreement, and easily merging the two versions by highlighting the sentences that differ.
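
To make the wish a bit more concrete, here is a rough Python sketch of the comparison step. It is not Annotatrix code: the data layout (each sentence as a list of `(head, deprel)` pairs) and the function names `cohen_kappa` and `differing_sentences` are assumptions made up for illustration, and Cohen's kappa is just one possible agreement measure.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of positions where both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


def differing_sentences(ann_a, ann_b):
    """Indices of sentences whose annotations are not identical."""
    return [i for i, (a, b) in enumerate(zip(ann_a, ann_b)) if a != b]


# Toy data (hypothetical): one (head, deprel) pair per token.
ann_a = [[(2, "nsubj"), (0, "root")], [(2, "amod"), (0, "root")]]
ann_b = [[(2, "nsubj"), (0, "root")], [(2, "nmod"), (0, "root")]]

labels_a = [rel for sent in ann_a for _, rel in sent]
labels_b = [rel for sent in ann_b for _, rel in sent]

print("kappa on deprels:", cohen_kappa(labels_a, labels_b))
print("sentences to review:", differing_sentences(ann_a, ann_b))
```

The list of differing sentence indices is what the interface would highlight for merging; everything else could be accepted automatically.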

Pan-Turkic-English-Russian dictionary

A pan-Turkic + English + Russian dictionary that we maintain, which is a superset of every Turkic-to-Turkic, Turkic-to-English, and Turkic-to-Russian bidix and is also used for testing translators. It would also help with classifying stems, i.e. with staying consistent across Turkic pairs. Exporting it to OmegaWiki would be cool as well (although we need much less than what OmegaWiki does).

I would store it in a format as close to the current bidix format as possible, e.g.:

<e>
  <p>
    <tat>китап<sdef n="n"/></tat>
    <kaz>кітап<sdef n="n"/></kaz>
    <eng>book<sdef n="n"/></eng>
    <rus>книга<sdef n="n"/><sdef n="nn"/><sdef n="f"/></rus>
  </p>
</e>
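
A pairwise bidix could then be generated from such entries. The following Python sketch shows one way this might look; the `<dictionary>` wrapper, the function names `side` and `project`, and the conversion of the symbol elements into `<s n="..."/>` inside `<l>`/`<r>` are assumptions for illustration, not an existing Apertium tool.

```python
import xml.etree.ElementTree as ET

# Sample in the proposed multi-language format (wrapper element is hypothetical).
SAMPLE = """
<dictionary>
  <e>
    <p>
      <tat>китап<sdef n="n"/></tat>
      <kaz>кітап<sdef n="n"/></kaz>
      <eng>book<sdef n="n"/></eng>
      <rus>книга<sdef n="n"/><sdef n="nn"/><sdef n="f"/></rus>
    </p>
  </e>
</dictionary>
"""


def side(source, tag):
    """Copy one language element into an <l> or <r> element with <s> symbols."""
    out = ET.Element(tag)
    out.text = source.text
    for sym in source:
        s = ET.SubElement(out, "s", {"n": sym.get("n")})
        s.tail = sym.tail
    return out


def project(root, left, right):
    """Build bidix-style <e> entries for one language pair.

    Entries that lack either language are skipped.
    """
    entries = []
    for p in root.iter("p"):
        l, r = p.find(left), p.find(right)
        if l is None or r is None:
            continue
        e = ET.Element("e")
        pair = ET.SubElement(e, "p")
        pair.append(side(l, "l"))
        pair.append(side(r, "r"))
        entries.append(e)
    return entries


root = ET.fromstring(SAMPLE)
for entry in project(root, "kaz", "rus"):
    print(ET.tostring(entry, encoding="unicode"))
```

Calling `project(root, "tat", "eng")` on the same data would give the Tatar-English entry, so every pair draws on the same stems and tags.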

There are some tricks in pardefs (which you can't have when using a spreadsheet or similar) that are useful when translating into or from Russian or English; they would work for any Turkic language.

The main motivation for such a dictionary is to keep everything in one place so that we can control things like "мүмкін емес". The reason it appeared in kaz.lexc was that eng-kaz.dix contained it. If there is a better way to handle it, we could agree upon one and store it in our pan-Turkic dictionary. You know, one single point of leverage for harnessing the influence of non-Turkic languages :)