Kurmanji and English/Final report

{{TOCD}}


This is the report for my 2016 Google Summer of Code project, Kurmanji-English Machine Translation.


== What was done ==
The goal of my project was to improve the existing Kurmanji-English pair significantly, bringing it to around release quality. I worked on adding vocabulary, Constraint Grammar (CG) disambiguation rules, transfer rules and lexical selection rules.

The vocabulary came from a number of sources; a few thousand entries were taken from the Kurmanji analyzer and POS tagger of Walther et al.


; Adherence to work plan
I largely followed the work plan and met its goals. When adding vocabulary, however, I had difficulty meeting the goals in some cases: because digital resources are lacking, I had to add translations one by one, using Kurdish-Turkish dictionaries and translating the meanings into English.


== Statistics ==
{| class="wikitable"
!
! Monodix
! Bidix
! CG Rules
! Transfer
! Paradigms
! Bilingual Coverage
|-
! Before
| 1433
| 11421
| 3
| 9
| 83
| 57%
|-
! After
| 17715
| 15597
| 97
| 23
| 157
| 85%
|}
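
To make the size of the change easier to read, the differences between the two rows can be worked out directly. The snippet below is only a small arithmetic sketch: the variable names are mine, the figures are copied from the table, and the 90% target is the one mentioned under Future work.

```shell
# Figures copied from the Statistics table; variable names are illustrative.
bidix_before=11421
bidix_after=15597
coverage_before=57
coverage_after=85
coverage_target=90   # target stated under Future work

# Plain integer arithmetic on the table values
bidix_added=$((bidix_after - bidix_before))
coverage_gain=$((coverage_after - coverage_before))
coverage_gap=$((coverage_target - coverage_after))

echo "Bidix growth: $bidix_added"
echo "Coverage gain: $coverage_gain percentage points"
echo "Gap to target: $coverage_gap percentage points"
```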


== Future work ==
The most immediate priorities for future work are adding more transfer rules, to improve the quality of translations, and raising the coverage of the bilingual dictionary from 85% to around 90%.





== List of commits ==
My commits are listed at the link below, under the folders with ''kmr'' in the name; ''kmr'' is the ISO 639-3 code for Kurmanji Kurdish.

https://apertium.projectjj.com/gsoc2016/memduhg.html

== Testing the Product ==

The translation pair can be installed using the commands below.


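<pre>
# Check out the language pair and the two monolingual packages
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-kmr-eng/
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-eng_feil/
svn co https://svn.code.sf.net/p/apertium/svn/incubator/apertium-kmr/

# Build the Kurmanji monolingual package
cd apertium-kmr
./autogen.sh
make
cd ..

# Build the English monolingual package
cd apertium-eng_feil
./autogen.sh
make
cd ..

# Configure and build the pair against the two packages above
cd apertium-kmr-eng
./autogen.sh --with-lang1=../apertium-kmr --with-lang2=../apertium-eng_feil
make
</pre>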
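
Before trying a translation, it can help to check that the build actually produced the mode. The sketch below is not part of the project: <code>has_mode</code> is a hypothetical helper, and it assumes that a built pair keeps its compiled modes as <code>modes/&lt;mode&gt;.mode</code> inside the pair directory, which is how Apertium pairs are normally laid out.

```shell
# Hypothetical helper: succeed if a compiled mode file exists for the pair.
# Assumption: the build writes modes/<mode>.mode inside the pair directory.
has_mode() {
  pair_dir=$1
  mode=$2
  [ -f "$pair_dir/modes/$mode.mode" ]
}

# Example: run from the directory that contains apertium-kmr-eng
if has_mode apertium-kmr-eng kmr-eng; then
  echo "kmr-eng mode found"
else
  echo "kmr-eng mode missing: run make in apertium-kmr-eng"
fi
```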


In the apertium-kmr-eng folder, piping text (with echo or cat) into <pre>apertium -d . kmr-eng</pre> outputs an English translation.
<pre>
echo "Ez gelek kefxweş im ku min ji bo GSoC kar kir" | apertium -d . kmr-eng
I very happy #be that I #for *GSoC worked#
</pre>
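
The symbols in the output above are Apertium's usual debugging markers: as far as I know, a leading <code>*</code> marks a word the analyser did not recognise and a leading <code>#</code> marks a word that could not be generated. A rough way to see how much of a text is affected is to count them. The function below is a sketch of that idea; <code>marker_summary</code> is a name I made up, and this count is not necessarily how the coverage figures in the Statistics table were obtained.

```shell
# Hypothetical helper: count marker-prefixed tokens in translated text on stdin.
# Assumption: a leading "*" means unknown word, a leading "#" means generation failure.
marker_summary() {
  awk '
    {
      for (i = 1; i <= NF; i++) {
        total++
        if ($i ~ /^\*/)      unknown++
        else if ($i ~ /^#/)  ungenerated++
      }
    }
    END { printf "tokens=%d unknown=%d ungenerated=%d\n", total, unknown, ungenerated }
  '
}

# Example on the sample output from this report (no Apertium needed)
echo 'I very happy #be that I #for *GSoC worked#' | marker_summary
```

With the pair built, the same function can be placed at the end of the pipeline, for example <code>cat text.txt | apertium -d . kmr-eng | marker_summary</code>, where <code>text.txt</code> stands for any Kurmanji text file.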
[[Category:Kurdish and English|*]]

Latest revision as of 10:29, 23 August 2016
