User:Qareken/GSoC2019Report

From Apertium

Develop a releasable Uzbek-Qaraqalpaq translation pair


In this [project] I worked on the unreleased uzb-kaa language pair. This required work in the (this) and (this) repositories. I created a dictionary of about 30,000 words. To do this, I first converted Akobirov's Uzbek-Russian dictionary, which contains more than 50,000 words, from DjVu to plain-text format, corrected the mistakes, and converted it to Latin script. I then did the same with Baskakov's Karakalpak-Russian dictionary, which contains more than 20,000 words. I also added Baskakov's Karakalpak-English dictionary, with more than 7,000 words, to the database and wrote the categories for the words in this Karakalpak database. I translated more than 30,000 words in the Uzbek word database and added categories where needed. Finally, I wrote the result to this file (link) in the format we need and fixed the problems that appeared in that file.
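The last step, writing translated words and their categories to a file, can be illustrated with a minimal sketch. It assumes that the target file is an Apertium bilingual dictionary and that each database row holds an Uzbek word, a Karakalpak word, and a category tag. The function name `bidix_entry`, the row layout, and the two sample rows are hypothetical and serve only as an illustration; they are not taken from the project's actual scripts or data.

```python
from xml.sax.saxutils import escape


def bidix_entry(left, right, tag):
    """Format one entry in Apertium bilingual-dictionary style.

    Assumption: both sides carry the same category tag.
    Spaces inside multiword entries are written as <b/>.
    """
    l = escape(left).replace(" ", "<b/>")
    r = escape(right).replace(" ", "<b/>")
    return f'<e><p><l>{l}<s n="{tag}"/></l><r>{r}<s n="{tag}"/></r></p></e>'


# Hypothetical sample rows: (Uzbek word, Karakalpak word, category tag).
# An empty string marks a missing translation or a missing category.
rows = [
    ("kitob", "kitap", "n"),
    ("suv", "suw", "n"),
    ("olma", "", "n"),   # category only, no translation yet
]

# Only rows that have both a translation and a category can be written out.
complete = [row for row in rows if all(row)]
incomplete = [row for row in rows if not all(row)]

entries = [bidix_entry(*row) for row in complete]
for entry in entries:
    print(entry)
```

Rows that lack either a translation or a category are kept apart in `incomplete` so that they can be completed later instead of being written out as broken entries.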

In the future, the same work will be done for the remaining words: some have only categories, and some have only translations. The problems with the selection rule will also be fixed.