Kurmanji and English/Final report
This is the report for my 2016 Google Summer of Code project, Kurmanji-English Machine Translation.
What was done
My project was to significantly improve the preexisting pair, bringing it to around release quality. I worked on adding vocabulary, disambiguation rules in CG (Constraint Grammar), transfer rules and lexical selection rules.
The vocabulary was added from a number of sources; a few thousand entries came from the work of Walther et al. on their Kurmanji analyzer and POS tagger.
- Adherence to work plan
I largely followed the work plan and met its goals. The exception was vocabulary, where in some cases I had difficulty meeting the goals: because of the lack of digital resources, I had to add translations one by one, using Kurdish-Turkish dictionaries and translating the meanings into English.
Statistics
| | Monodix | Bidix | CG Rules | Transfer Rules | Paradigms | Bilingual Coverage |
|---|---|---|---|---|---|---|
| Before | 1433 | 11421 | 3 | 9 | 83 | 57% |
| After | 17715 | 15597 | 97 | 23 | 157 | 85% |
Future work
The most immediate priorities for future work are adding more transfer rules to improve translation quality, and raising the coverage of the bilingual dictionary further, to around 90%.
List of commits
My commits are listed at the link below, under the folders named kmr, which is the ISO 639-3 code for Kurmanji Kurdish.
https://apertium.projectjj.com/gsoc2016/memduhg.html
Testing the Product
The translation pair can be installed using the commands below.
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-kmr-eng/
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-eng_feil/
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-kmr/
cd apertium-kmr
./autogen.sh
make
cd ..
cd apertium-eng_feil
./autogen.sh
make
cd ..
cd apertium-kmr-eng
./autogen.sh --with-lang1=../apertium-kmr --with-lang2=../apertium-eng_feil
make
In the apertium-kmr-eng folder, piping text (with echo or cat) into
apertium -d . kmr-eng
will output an English translation.
echo "Ez gelek kefxweş im ku min ji bo GSoC kar kir" | apertium -d . kmr-eng
I very happy #be that I #for *GSoC worked#
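The sample output above contains Apertium's debugging markers: a leading `*` flags a word the analyzer did not recognize, and a leading `#` flags a word that could not be generated in the target language. As a rough illustration, the sketch below counts those markers in the sample line. The variable names are made up for this example, and the percentage it prints is only a naive proxy computed on one translated sentence; it is not how the coverage figures in the Statistics table were obtained.

```shell
# Sample output copied from the report; in practice this would be captured
# from `apertium -d . kmr-eng` instead of being hard-coded.
output='I very happy #be that I #for *GSoC worked#'

# Split on spaces so that each token sits on its own line.
tokens=$(printf '%s\n' "$output" | tr -s ' ' '\n')

# Count all tokens, then those starting with each marker.
total=$(printf '%s\n' "$tokens" | wc -l | tr -d ' ')
unknown=$(printf '%s\n' "$tokens" | grep -c '^\*')
ungenerated=$(printf '%s\n' "$tokens" | grep -c '^#')

# Share of tokens without a leading `*` (integer percentage, rounded down).
# Assumption: this is an illustrative proxy, not the report's coverage metric.
clean_pct=$(( 100 * (total - unknown) / total ))

echo "tokens=$total unknown=$unknown ungenerated=$ungenerated clean=${clean_pct}%"
```

Running the same counts over a larger translated text would give a quick view of which words still need dictionary entries (`*`) and which need work on the target-language side (`#`).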