Turkish and Kyrgyz/Final report

Description

The aim of this project was to develop apertium-tr-ky, an Apertium machine translation system between Turkish and Kyrgyz. It was a challenging project. The translation is not perfect and there are a lot of things to improve, but the result is quite satisfying given the time available and the work done. Using apertium-tr-ky we translated several children's stories and showed them to native speakers of Kyrgyz, and we got very positive reactions from them. Right now apertium-tr-ky is the only MT tool from any language into Kyrgyz, so I consider it a great success.

TRmorph

We used TRmorph as the morphological analyzer/generator for Turkish. Even though there are many things to improve in TRmorph, it is quite usable. Thanks to Çağrı for his great work.

kymorph

We developed a new morphological analyzer/generator for Kyrgyz, kymorph, from scratch, as no other exists. So kymorph is (right now) the only morphological analyzer/generator for Kyrgyz. It is developed with HFST. This was the toughest part of the project, because there are no good resources on Kyrgyz morphological structure and no lexicon database with parts of speech. We achieved coverage of % on the SETimes corpora. I am really happy with kymorph.

bidix

Our bidix has 7122 entries, which is a good size for now. Not having a decent digital Turkish-Kyrgyz dictionary was a big issue. I am planning to revise the bidix later.

Transfer rules

It is really tough to come up with definite transfer rules between Turkish and Kyrgyz. Even so, we came up with some rules that work very well. I still plan to write new rules and revise the existing ones.

CG

We use the same CG that is used in apertium-tr-az. Obviously it must be developed and revised further. I'd like to thank #zfe and #spectre for their great work.

Statistics

Dictionaries
  • trmorph lexicon:
  • apertium-tr-ky.tr-ky.dix (unique: , total: ) 7122
  • apertium-tr-ky.ky.lexc
Coverage
  • Turkish Wikipedia ( , std. dev.: )
  • Turkish SETimes ( , std. dev.: )
  • Turkish ... ( , std. dev.: )
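For readers unfamiliar with the coverage figures listed above, here is a minimal sketch of how naive coverage can be computed from analyser output in the Apertium stream format, where an unknown word is marked with a leading `*` in its analysis. The function name `naive_coverage` and the sample string are made up for illustration; this is not necessarily the script used for this report.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")


def naive_coverage(analysed_text: str) -> float:
    """Return the fraction of tokens that received at least one analysis.

    Hypothetical helper: a token counts as unknown when its analysis
    starts with '*', which is how the analyser marks unrecognised words.
    """
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return known / len(units)


# Made-up sample output: three analysed tokens and one unknown token.
sample = "^ev/ev<n><nom>$ ^qwerty/*qwerty$ ^ve/ve<cnjcoo>$ ^kitap/kitap<n><nom>$"
print(naive_coverage(sample))
```

A standard deviation like the ones in the list would then presumably come from running the same calculation over several parts of a corpus and comparing the results.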
Rules
Error rate
File                                Num. Words  % OOV  WER (Sur)  PER (Sur)  WER (Lem)  PER (Lem)
setimes.kosova_plate.tr.txt         243         -      -          -          -          -
setimes.kosova.tr.txt               424         -      -          -          -          -
setimes.bulgar.tr.txt               395         -      -          -          -          -
wikipedia.kadinlar_askerler.tr.txt  1165        -      -          -          -          -
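As a reminder of what the WER and PER columns measure, the following is a minimal sketch using one common definition of each: WER as the word-level edit distance divided by the reference length, and PER as the same idea with word order ignored. The function names are made up for illustration, the evaluation tool used for this report may compute them slightly differently, and the example sentences are placeholder tokens, not real translations. Presumably "Sur" and "Lem" refer to scoring on surface forms and on lemmas, in which case the same functions would be applied to the lemmatised text.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Dynamic-programming edit distance, keeping only one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1] / len(ref)


def position_independent_error_rate(reference: str, hypothesis: str) -> float:
    """Error rate that ignores word order (bag-of-words comparison)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Words shared by both sentences, counted with multiplicity.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Placeholder tokens: the hypothesis has the right words in the wrong order.
print(word_error_rate("a b c d", "b a c d"))
print(position_independent_error_rate("a b c d", "b a c d"))
```

A large gap between WER and PER for a file would suggest that most errors are word-order problems rather than wrong word choices.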

Future work