Kurmanji and English/Final report


This is the report for my 2016 Google Summer of Code project, Kurmanji-English Machine Translation.

What was done

The goal of my project was to significantly improve the pre-existing Kurmanji-English pair, bringing it close to release quality. I worked on adding vocabulary, disambiguation rules in CG (Constraint Grammar), transfer rules, and lexical selection rules.

The vocabulary was added from a number of sources; a few thousand entries came from the work of Walther et al. on their Kurmanji analyzer and POS tagger.

Adherence to work plan

I largely followed the work plan and met its goals. Adding vocabulary was harder in some cases: because of the lack of digital resources, I had to add translations one by one, using Kurdish-Turkish dictionaries and translating the meanings into English.

Statistics

        Monodix  Bidix  CG Rules  Transfer  Paradigms  Bilingual Coverage
Before     1433  11421         3         9         83                 57%
After     17715  15597        97        23        157                 85%

Future work

The most immediate concerns for future work are adding more transfer rules to improve translation quality, and raising the coverage of the bilingual dictionary a little further, to around 90%.


List of commits

My commits are listed at the link below, under the folders whose names contain kmr, the ISO 639-3 code for Kurmanji Kurdish.

https://apertium.projectjj.com/gsoc2016/memduhg.html

Testing the Product

The translation pair can be installed using the commands below.

svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-kmr-eng/
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-eng_feil/
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-kmr/
cd apertium-kmr
./autogen.sh
make
cd ..
cd apertium-eng_feil
./autogen.sh
make
cd ..
cd apertium-kmr-eng
./autogen.sh --with-lang1=../apertium-kmr --with-lang2=../apertium-eng_feil
make

From inside the apertium-kmr-eng folder, pipe text (for example with echo or cat) to

apertium -d . kmr-eng

to get an English translation.

echo "Ez gelek kefxweş im ku min ji bo GSoC kar kir" | apertium -d . kmr-eng
I very happy #be that I #for *GSoC  worked#
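
In the output above, the `*` prefix marks a word the analyser did not recognise, and `#` typically marks a form the generator could not produce. As a rough sketch, the share of known tokens in a translation can be estimated by counting whitespace-separated tokens that do not start with `*`. The `coverage` function below is a hypothetical helper, not part of Apertium, and this report does not state how its coverage figures were measured, so the number it prints is only an approximation.

```shell
# Hypothetical helper (not part of Apertium): estimate how many tokens
# in Apertium output, read from stdin, are known to the analyser.
# Assumption: tokens are whitespace-separated and unknown words carry
# a leading '*'.
coverage() {
  # Put one token per line, then count all tokens and the '*'-prefixed ones.
  tr -s '[:space:]' '\n' | awk '
    NF { total++; if ($0 ~ /^\*/) unknown++ }
    END {
      if (total == 0) { print "no tokens"; exit 1 }
      printf "%d/%d tokens known (%.1f%%)\n", total - unknown, total, 100 * (total - unknown) / total
    }'
}
```

For instance, feeding it the sample translation above:

```shell
echo "I very happy #be that I #for *GSoC  worked#" | coverage
# prints: 8/9 tokens known (88.9%)
```

In practice the input would be the output of `apertium -d . kmr-eng` run on a larger text, which gives a more meaningful figure than a single sentence.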