User:Qareken/GSoC2019Report

From Apertium

Develop a releasable Uzbek-Qaraqalpaq translation pair

In this project I worked on the unreleased uzb-kaa language pair. For this I had to work in this and this repository. I created a dictionary of about 30,000 words.

What is done

First, I converted Akobirov's Uzbek-Russian dictionary, which contains more than 50,000 words, from DjVu to plain text, corrected the mistakes, and converted it to Latin script. I then did the same with Baskakov's Karakalpak-Russian dictionary, which contains more than 20,000 words. I also added Baskakov's Karakalpak-English dictionary, with more than 7,000 words, to the database, and recorded the categories of the Karakalpak words in the database. I translated more than 30,000 words in the Uzbek word database and added categories where needed. Finally, I wrote the result to this file in the form we need and fixed the problems that appeared in it.
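As an illustration of the last step, here is a minimal sketch of turning translated rows into dictionary entries. It assumes the target file is an Apertium bilingual dictionary in the usual .dix XML format; the helper name bidix_entry and the sample rows are hypothetical and are not taken from the project database.

```python
from xml.sax.saxutils import escape


def bidix_entry(left: str, right: str, category: str) -> str:
    """Build one bilingual-dictionary <e> entry as an XML string.

    left/right are the lemmas of the two languages; category is the
    part-of-speech tag (for example "n"). Spaces inside multiword
    lemmas are written as <b/>, as is usual in .dix files.
    """
    tag = f'<s n="{category}"/>'
    l_side = escape(left).replace(" ", "<b/>")
    r_side = escape(right).replace(" ", "<b/>")
    return f"<e><p><l>{l_side}{tag}</l><r>{r_side}{tag}</r></p></e>"


# Hypothetical sample rows: (Uzbek lemma, Karakalpak lemma, category)
rows = [("kitob", "kitap", "n"), ("suv", "suw", "n")]

for uzb, kaa, cat in rows:
    print(bidix_entry(uzb, kaa, cat))
```

Generating the entries from the database this way, rather than typing them by hand, keeps the tags on both sides consistent.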

What should be done in the future

In the future, the same work will be done for the remaining words: some have only categories, and some have only translations. The problems with the selection rule will also be fixed.
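The remaining words could be located with a small script such as the sketch below. The Entry record and its field names are assumptions made for illustration; the actual database layout is not described here.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """One row of a hypothetical word database."""
    uzb: str                  # Uzbek lemma
    kaa: Optional[str]        # Karakalpak translation, if any
    category: Optional[str]   # part-of-speech category, if any


def split_incomplete(entries: List[Entry]) -> Tuple[List[Entry], List[Entry]]:
    """Return (entries with only a category, entries with only a translation)."""
    only_category = [e for e in entries if e.category and not e.kaa]
    only_translation = [e for e in entries if e.kaa and not e.category]
    return only_category, only_translation


# Hypothetical sample data
sample = [
    Entry("kitob", "kitap", "n"),   # complete
    Entry("suv", None, "n"),        # category only
    Entry("katta", "úlken", None),  # translation only
]

only_cat, only_tr = split_incomplete(sample)
print(len(only_cat), "entries still need a translation")
print(len(only_tr), "entries still need a category")
```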