Difference between revisions of "User:Ilnar.salimzyan/Wishlist"
Revision as of 14:08, 8 March 2015
Annotatrix
For Annotatrix: comparing my annotation with the annotation done by another user, calculating the inter-annotator agreement, and easy merging of the two versions by highlighting the sentences which differ.
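A rough sketch of what such a comparison could compute, in Python. The data layout here is purely hypothetical (a dict from sentence id to a list of (head, deprel) pairs, one per token) and is not Annotatrix's actual format; the agreement measure is plain per-token agreement, not a chance-corrected one such as kappa.

```python
def compare_annotations(mine, theirs):
    """Return (agreement, differing_ids) over sentences present in both annotations.

    mine / theirs: dict mapping a sentence id to a list of (head, deprel)
    tuples, one per token (hypothetical layout, for illustration only).
    """
    agree = 0
    total = 0
    differing = []
    for sent_id in sorted(mine.keys() & theirs.keys()):
        a, b = mine[sent_id], theirs[sent_id]
        if len(a) != len(b):
            # Tokenisation differs: treat every token as a disagreement.
            total += max(len(a), len(b))
            differing.append(sent_id)
            continue
        matches = sum(x == y for x, y in zip(a, b))
        agree += matches
        total += len(a)
        if matches != len(a):
            differing.append(sent_id)
    agreement = agree / total if total else 1.0
    return agreement, differing


mine = {
    "s1": [(2, "nsubj"), (0, "root")],
    "s2": [(0, "root"), (1, "obj")],
}
theirs = {
    "s1": [(2, "nsubj"), (0, "root")],
    "s2": [(0, "root"), (1, "obl")],
}
agreement, differing = compare_annotations(mine, theirs)
```

The list of differing sentence ids is what a merge view would highlight; sentences that agree completely could be merged without review.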
Pan-Turkic-English-Russian dictionary
A Pan-Turkic+English+Russian dictionary which we maintain (a superset of any Turkic-to-Turkic, Turkic-to-English or Turkic-to-Russian bidix, which is also used for testing translators). It would help with the classification of stems (i.e. with being consistent across Turkic pairs). Exporting it into OmegaWiki would be cool as well (although we need much less than what OmegaWiki does).
I would store it in a format as close to the current bidix format as possible, e.g.:
<e>
  <p>
    <tat>китап<sdef n="n"/></tat>
    <kaz>кітап<sdef n="n"/></kaz>
    <eng>book<sdef n="n"/></eng>
    <rus>книга<sdef n="n"/><sdef n="nn"/><sdef n="f"/></rus>
  </p>
</e>
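A minimal sketch of how a pairwise bidix entry might be pulled out of such a multilingual entry, in Python. The function names are made up for illustration, the entry is closed with </e> so that it parses, and the output uses the l/r/s element names of ordinary bidix entries. Multiword lemmas and XML escaping are not handled.

```python
import xml.etree.ElementTree as ET

# The example entry from above, closed with </e> so that it is well-formed.
PAN_ENTRY = """
<e>
  <p>
    <tat>китап<sdef n="n"/></tat>
    <kaz>кітап<sdef n="n"/></kaz>
    <eng>book<sdef n="n"/></eng>
    <rus>книга<sdef n="n"/><sdef n="nn"/><sdef n="f"/></rus>
  </p>
</e>
"""


def extract_pair(entry_xml, left, right):
    """Return ((lemma, tags), (lemma, tags)) for two languages, or None if one is missing."""
    entry = ET.fromstring(entry_xml)
    sides = []
    for lang in (left, right):
        node = entry.find(f"./p/{lang}")
        if node is None:
            return None
        lemma = (node.text or "").strip()
        tags = [s.get("n") for s in node.findall("sdef")]
        sides.append((lemma, tags))
    return tuple(sides)


def to_bidix_entry(pair):
    """Serialise an extracted pair as a one-line bidix-style <e> element."""
    def side(name, lemma, tags):
        symbols = "".join(f'<s n="{t}"/>' for t in tags)
        return f"<{name}>{lemma}{symbols}</{name}>"

    (l_lemma, l_tags), (r_lemma, r_tags) = pair
    return "<e><p>" + side("l", l_lemma, l_tags) + side("r", r_lemma, r_tags) + "</p></e>"


kaz_rus = extract_pair(PAN_ENTRY, "kaz", "rus")
bidix_line = to_bidix_entry(kaz_rus)
```

Run over the whole dictionary for a given language pair, something like this would regenerate each pairwise bidix from the single shared source.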
There are some little tricks in pardefs (which you can't have when using a spreadsheet or similar) that are useful when translating into or from Russian or English, and they would work for any Turkic language.
The main motivation for such a dictionary is that we keep everything in one place so that we can control things like "мүмкін емес". The reason it appeared in kaz.lexc was that eng-kaz.dix contained it. If there is a better way to handle it, we could agree upon one and store it in our Pan-Turkic dictionary. You know, one single point of leverage for harnessing the influence of non-Turkic languages :)