Difference between revisions of "Uighur and Turkish/Paper"

From Apertium
 
(23 intermediate revisions by the same user not shown)
Line 9: Line 9:
 
=== Images and Diagrams ===
 
=== Images and Diagrams ===
 
== Related Work ==
 
== Related Work ==
  +
'''SEARCH ON CNKI + GOOGLE TRANSLATE'''
 
=== Uyghur-specific ===
 
=== Uyghur-specific ===
 
==== Chinese-Uyghur ====
 
==== Chinese-Uyghur ====
* [http://en.cnki.com.cn/Article_en/CJFDTOTAL-JSJY200907077.htm Chinese-Uyghur machine translation system for phrase-based statistical translation]
+
* [http://en.cnki.com.cn/Article_en/CJFDTOTAL-JSJY200907077.htm Chinese-Uyghur machine translation system for phrase-based statistical translation] [https://scholar.googleusercontent.com/scholar.bib?q=info:i2M6ziEGJygJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAW0WtH2lkp-FHPUOgqLdd6dyCXCX64S-3&scisf=4&ct=citation&cd=-1&hl=tr BiBTeX]
* [http://en.cnki.com.cn/Article_en/CJFDTOTAL-JSJC201109006.htm Phrase-based Chinese-Uyghur/Uyghur-Chinese Statistical Machine Translation]
+
* [http://en.cnki.com.cn/Article_en/CJFDTOTAL-JSJC201109006.htm Phrase-based Chinese-Uyghur/Uyghur-Chinese Statistical Machine Translation] [https://scholar.googleusercontent.com/scholar.bib?q=info:loTIblx2Xb8J:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAW0Ws9Z0wEvPxTSb8oxJcmqcuhVMRlev_&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
  +
* [https://ieeexplore.ieee.org/abstract/document/5666183/ Chinese-uyghur statistical machine translation: The initial explorations] - having trouble finding this but it would be good to read [https://scholar.googleusercontent.com/scholar.bib?q=info:L4XMl70eb7MJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAW0Ws1cch02S87veBQRuAWQAccylv95tY&scisf=4&ct=citation&cd=-1&hl=tr BiBTeX]
* [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.614.5731&rep=rep1&type=pdf Rule Based Analysis of the Uyghur Nouns]
 
 
* [http://en.cnki.com.cn/Article_en/CJFDTOTAL-JYRJ201203060.htm DESIGNING RELATED TRANSFORMATION AND MATCHING RULES FOR CHINESE-UYGHUR MACHINE TRANSLATION] [https://scholar.googleusercontent.com/scholar.bib?q=info:PH1KcjjGW1EJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAW0Wspyzh5chZn9T7jD9kvZqiqNGB5fGN&scisf=4&ct=citation&cd=-1&hl=tr BiBTeX]
* [https://ieeexplore.ieee.org/abstract/document/5666183/ Chinese-uyghur statistical machine translation: The initial explorations]
 
  +
* [http://www.apsipa.org/proceedings/2017/CONTENTS/papers2017/14DecThursday/TP-04/TP-04.1.pdf Memory-augmented Chinese-Uyghur Neural Machine Translation] - This looks interesting: 200K sentences of bilingual data were collected, so we should contact the authors to see if we can access it [https://scholar.googleusercontent.com/scholar.bib?q=info:xfOwLXHNY-oJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAW0WsXtH_pCub9w2ljxwrPaz9VQIZIaLO&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
* [http://en.cnki.com.cn/Article_en/CJFDTOTAL-JYRJ201203060.htm DESIGNING RELATED TRANSFORMATION AND MATCHING RULES FOR CHINESE-UYGHUR MACHINE TRANSLATION]
 
  +
* [http://aclweb.org/anthology/I17-3008 XMU Neural Machine Translation Online Service] - Also covers Uyghur and has a web interface [http://nmt.cloudtrans.org/ here], but the details of the data and evaluations are unclear [https://scholar.googleusercontent.com/scholar.bib?q=info:A6cMdf1SuHwJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAW0WsiJ1Dq9p75V5KHHvesu1gfIufLIw2&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
* [http://www.apsipa.org/proceedings/2017/CONTENTS/papers2017/14DecThursday/TP-04/TP-04.1.pdf Memory-augmented Chinese-Uyghur Neural Machine Translation] - This looks interesting
 
   
 
==== Japanese-Uyghur ====
 
==== Japanese-Uyghur ====
Line 27: Line 28:
 
==== Non-english Research ====
 
==== Non-english Research ====
 
Japanese:
 
Japanese:
Muhsut Muhtar seems to have worked a lot on japanese-uighur translation in japanese universities.
+
Muhtar Muhsut seems to have worked extensively on Japanese-Uighur translation at Japanese universities.
 
* [https://ci.nii.ac.jp/naid/110003278410/ A PARAMETERIZED APPROACH TO PROCESSING AUXILIARY VERBS IN JAPANESE- UIGHUR MACHINE TRANSLATION] [https://scholar.googleusercontent.com/scholar.bib?q=info:aR2EhSMTlmIJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWztYDJcz7_BhfJiy7GNOmpfVvMIxh3-i&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
 
* [https://ci.nii.ac.jp/naid/110003278410/ A PARAMETERIZED APPROACH TO PROCESSING AUXILIARY VERBS IN JAPANESE- UIGHUR MACHINE TRANSLATION] [https://scholar.googleusercontent.com/scholar.bib?q=info:aR2EhSMTlmIJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWztYDJcz7_BhfJiy7GNOmpfVvMIxh3-i&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
 
* [https://ci.nii.ac.jp/naid/110002911592 Paraphrasing Japanese Words to Expand a Japanese - Uighur Bilingual Dictionary ]
 
* [https://ci.nii.ac.jp/naid/110002911592 Paraphrasing Japanese Words to Expand a Japanese - Uighur Bilingual Dictionary ]
 
* [https://www.jstage.jst.go.jp/article/jnlp1994/8/3/8_3_123/_article/-char/ja/ Conversion process of case particles for Japanese-Uighur machine translation] - There is an english version of this above
 
* [https://www.jstage.jst.go.jp/article/jnlp1994/8/3/8_3_123/_article/-char/ja/ Conversion process of case particles for Japanese-Uighur machine translation] - There is an english version of this above
Turkish:
+
'''Turkish''':
* [https://polen.itu.edu.tr/xmlui/bitstream/handle/11527/527/10459.pdf?sequence=1&isAllowed=y Uygurcadan Türkçeye bilgisayarlı çeviri.] [https://scholar.googleusercontent.com/scholar.bib?q=info:zmL5BQqzH04J:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWzsR0fI2a0hOeD13Uc1ofWmW8gYQMlf6&scisf=4&ct=citation&cd=-1&hl=tr BibTex] - We should probably find and cite this, and also look into the other stuff Orhun/Tantuğ/Adalı have been up to.
+
* [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.614.5731&rep=rep1&type=pdf Rule Based Analysis of the Uyghur Nouns] [https://scholar.googleusercontent.com/scholar.bib?q=info:IPwR6tYl7u8J:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAW0Wr46dRPOwCNffhZ3f09Pmh96Cv0DwQ&scisf=4&ct=citation&cd=-1&hl=tr BibTex]
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #290286: Description of Tatar morphology and a Tatar-Turkish machine translation system / Tatarcanın morfolojisinin tanımlanması ve Tatarca-Türkçe makine çeviri sistemi - ERCAN GÖKGÖZ
* [https://polen.itu.edu.tr/handle/11527/518 Akraba Ve Bitişken Diller Arasında Bilgisayarlı Çeviri İçin Karma Bir Model] [https://scholar.googleusercontent.com/scholar.bib?q=info:5rSIKuO_cR4J:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWzsVAXD0j27q7UXvnKrisWh5Eft5EKd8&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
 
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #343000: Text conversion system between Turkic dialects / Türk lehçeleri arasında çeviri sistemi - EMEL ALKIM
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #318598: Two level Uyghur morphology and Uyghur Turkish machine translation / İki düzeyli Uygur morfolojisi ve Uygur Türkçe makine çevirisi - RÜMEYSA KESKİN
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #293849: Uygurcadan Türkçeye bilgisayarlı çeviri - Murat Orhun
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #223702: Akraba Ve Bitişken Diller Arasında Bilgisayarlı Çeviri İçin Karma Bir Model - A. Cüneyd Tantuğ
  +
* Tantuğ Turkmen?
  +
* [http://journals.manas.edu.kg/mjen/archives/Y2015_V3_I2/043327ce78b3213929418147b6fe2a7c.pdf T. Nakılay, M. TEKEREK, and U. BRİMKULOV, “Kırgız ve Türkiye Türkçeleri arasında istatistiksel bilgisayarlı çeviri uygulaması ve başarım testi,” Manas Journal of Engineering, vol. 3, no. 2, pp. 59–68, Dec. 2015.] - YÖK akademik
  +
* E. ADALI, A. C. TANTUĞ, and M. ORHUN, “Türk dilleri arasında makineli çeviri sistemleri Türkmence ve Uygurcadan Türkçeye Çeviri Programı,” presented at the Uluslar Arası Türk Lehçeleri Arasında Aktarma ve Uygulama Sempozyumu, Maltepe Üniversitesi, Istanbul, 2009. - YÖK akademik
  +
* M. ORHUN, A. C. TANTUĞ, and E. ADALI, “Uygurcada Biçimbilimsel Belirsizlik,” presented at the Akademik Bilişim 2010, Muğla Üniversitesi, Muğla, Türkiye, 2010. - YÖK akademik
  +
* M. ORHUN, “Uygur Tümcesinin Bilgisayar ile Çözümlenmesi,” presented at the Akademik Bilişim 2013, Akdeniz Üniversitesi, Antalya, Türkiye, 2013. - YÖK akademik
  +
* Computational Comparison of the Uyghur and Turkish Grammar (Murat Orhun, Eşref Adalı, A. Cüneyd Tantuğ), In IEEE International Conference On Computer Science and Information Technology (ICCSIT 2009), 2009.
  +
* Machine translation from Turkmen language to Turkish (A. Cüneyd Tantuğ, Eşref Adalı), In İTÜ Dergisi (D), volume 7, 2008.
  +
* A MT System From Turkmen to Turkish Employing Finite State And Statistical Methods (A. Cüneyd Tantuğ, Eşref Adalı, Kemal Oflazer), In MT Summit XI, 2007. [bibtex]
  +
* A Prototype Machine Translation System Between Turkmen and Turkish (A. Cüneyd Tantuğ, Eşref Adalı, Kemal Oflazer), In Fifteenth Turkish Symposium on Artificial Intelligence and Neural Networks, TAINN, 2006. [bibtex]
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #29910: Machine translation from Turkish to other Turkic languages and an implementation for the Azeri language - İLKER HAMZAOĞLU
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #430917: Example based machine translation system between Kazakh and Turkish supported by statistical language model / Kazakça ve Türkçe dilleri arasında örnek tabanlı ve istatistik model destekli makine çeviri sistemi - GULSHAT KESSİKBAYEVA
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #182996: Türkçe - Türkmence bilgisayarlı çeviri sistemi / A Turkish - Turkmen machine translation system - GUYCHMYRAT AMANMYRADOV
  +
* [https://tez.yok.gov.tr/UlusalTezMerkezi YÖK] Thesis #244906: Turkish and Turkmen morphological analyzer and machine translation program / Türkçe ve Türkmence biçimbirimsel çözümleme ve makine çeviri programı - MAXİM SHYLOV
   
 
==== Other ====
 
==== Other ====
Line 43: Line 61:
 
* ADD ALTINTAS CRH STUFF
 
* ADD ALTINTAS CRH STUFF
 
* Hamzaoğlu azeri
 
* Hamzaoğlu azeri
  +
* Turkish–Tatar (Gilmullin, 2008)
 
* [http://research.sabanciuniv.edu/6395/1/MT_Summit_XI.pdf A mt system from turkmen to turkish employing finite state and statistical methods] [https://scholar.googleusercontent.com/scholar.bib?q=info:caLgCiod5WcJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWzsRAg7jw3U_Q33AiSdfytCSPNJqWFm0&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
 
* [http://research.sabanciuniv.edu/6395/1/MT_Summit_XI.pdf A mt system from turkmen to turkish employing finite state and statistical methods] [https://scholar.googleusercontent.com/scholar.bib?q=info:caLgCiod5WcJ:scholar.google.com/&output=citation&scisig=AAGBfm0AAAAAWzsRAg7jw3U_Q33AiSdfytCSPNJqWFm0&scisf=4&ct=citation&cd=-1&hl=tr BibTeX]
   
 
=== Fun ===
 
=== Fun ===
 
* [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.2342&rep=rep1&type=pdf PARTICLE-BASED MACHINE TRANSLATION FOR ALTAIC LANGUAGES :THE JAPANESE-UIGHUR CASE]
 
* [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.2342&rep=rep1&type=pdf PARTICLE-BASED MACHINE TRANSLATION FOR ALTAIC LANGUAGES :THE JAPANESE-UIGHUR CASE]
  +
  +
=== Tangentially Related ===
  +
* [https://benthamopen.com/contents/pdf/TOCSJ/TOCSJ-8-739.pdf Uyghur-Chinese Translation Disambiguation Method Research Based on Knowledge Automatic-Acquisition]
  +
* [https://pdfs.semanticscholar.org/7df2/98eeb3dba50db03be3ac47e8c6ae82a87f2a.pdf Uyghur Language Model with Graphic Structure]
  +
* [http://lrec-conf.org/workshops/lrec2018/W34/pdf/14_W34.pdf Construction of Uyghur named entity corpus]
  +
* [http://en.cnki.com.cn/Article_en/CJFDTOTAL-MESS201004017.htm Conceptual Design of Uyghur FrameNet]
   
 
== Evaluations ==
 
== Evaluations ==
Line 55: Line 80:
   
 
=== BLEU ===
 
=== BLEU ===
  +
  +
=== Various Potential Corpora ===
  +
* ELRA Uyghur Text corpus - they're not too eager to provide this
  +
* [http://www.aclweb.org/anthology/Y03-1025 The development of tagged Uyghur corpus]
  +
* [http://www.lancaster.ac.uk/fass/projects/corpus/UCCTS2008Proceedings/papers/Mamitimin_and_Dawut.pdf Chinese-Uyghur Parallel Corpus Construction and its Application]
   
 
== Examples ==
 
== Examples ==

Latest revision as of 07:00, 14 August 2018

Info related to paper on uig-tur RBMT.

TO DO

  • Fill out LaTeX template
  • Evaluate system on corpora

To write

  • Intro
  • Evals

Images and Diagrams

Related Work

SEARCH ON CNKI + GOOGLE TRANSLATE

Uyghur-specific

Chinese-Uyghur

Japanese-Uyghur

Non-English Research

Japanese: Muhtar Muhsut seems to have worked extensively on Japanese-Uighur translation at Japanese universities.

Turkish:

  • Rule Based Analysis of the Uyghur Nouns BibTex
  • YÖK Thesis #290286: Description of Tatar morphology and a Tatar-Turkish machine translation system / Tatarcanın morfolojisinin tanımlanması ve Tatarca-Türkçe makine çeviri sistemi - ERCAN GÖKGÖZ
  • YÖK Thesis #343000: Text conversion system between Turkic dialects / Türk lehçeleri arasında çeviri sistemi - EMEL ALKIM
  • YÖK Thesis #318598: Two level Uyghur morphology and Uyghur Turkish machine translation / İki düzeyli Uygur morfolojisi ve Uygur Türkçe makine çevirisi - RÜMEYSA KESKİN
  • YÖK Thesis #293849: Uygurcadan Türkçeye bilgisayarlı çeviri - Murat Orhun
  • YÖK Thesis #223702: Akraba Ve Bitişken Diller Arasında Bilgisayarlı Çeviri İçin Karma Bir Model - A. Cüneyd Tantuğ
  • Tantuğ Turkmen?
  • T. Nakılay, M. TEKEREK, and U. BRİMKULOV, “Kırgız ve Türkiye Türkçeleri arasında istatistiksel bilgisayarlı çeviri uygulaması ve başarım testi,” Manas Journal of Engineering, vol. 3, no. 2, pp. 59–68, Dec. 2015. - YÖK akademik
  • E. ADALI, A. C. TANTUĞ, and M. ORHUN, “Türk dilleri arasında makineli çeviri sistemleri Türkmence ve Uygurcadan Türkçeye Çeviri Programı,” presented at the Uluslar Arası Türk Lehçeleri Arasında Aktarma ve Uygulama Sempozyumu, Maltepe Üniversitesi, Istanbul, 2009. - YÖK akademik
  • M. ORHUN, A. C. TANTUĞ, and E. ADALI, “Uygurcada Biçimbilimsel Belirsizlik,” presented at the Akademik Bilişim 2010, Muğla Üniversitesi, Muğla, Türkiye, 2010. - YÖK akademik
  • M. ORHUN, “Uygur Tümcesinin Bilgisayar ile Çözümlenmesi,” presented at the Akademik Bilişim 2013, Akdeniz Üniversitesi, Antalya, Türkiye, 2013. - YÖK akademik
  • Computational Comparison of the Uyghur and Turkish Grammar (Murat Orhun, Eşref Adalı, A. Cüneyd Tantuğ), In IEEE International Conference On Computer Science and Information Technology (ICCSIT 2009), 2009.
  • Machine translation from Turkmen language to Turkish (A. Cüneyd Tantuğ, Eşref Adalı), In İTÜ Dergisi (D), volume 7, 2008.
  • A MT System From Turkmen to Turkish Employing Finite State And Statistical Methods (A. Cüneyd Tantuğ, Eşref Adalı, Kemal Oflazer), In MT Summit XI, 2007. [bibtex]
  • A Prototype Machine Translation System Between Turkmen and Turkish (A. Cüneyd Tantuğ, Eşref Adalı, Kemal Oflazer), In Fifteenth Turkish Symposium on Artificial Intelligence and Neural Networks, TAINN, 2006. [bibtex]
  • YÖK Thesis #29910: Machine translation from Turkish to other Turkic languages and an implementation for the Azeri language - İLKER HAMZAOĞLU
  • YÖK Thesis #430917: Example based machine translation system between Kazakh and Turkish supported by statistical language model / Kazakça ve Türkçe dilleri arasında örnek tabanlı ve istatistik model destekli makine çeviri sistemi - GULSHAT KESSİKBAYEVA
  • YÖK Thesis #182996: Türkçe - Türkmence bilgisayarlı çeviri sistemi / A Turkish - Turkmen machine translation system - GUYCHMYRAT AMANMYRADOV
  • YÖK Thesis #244906: Turkish and Turkmen morphological analyzer and machine translation program / Türkçe ve Türkmence biçimbirimsel çözümleme ve makine çeviri programı - MAXİM SHYLOV

Other

Turkic

Fun

Tangentially Related

Evaluations

Since a number of prototypes already exist, mostly for Chinese-Uyghur, it might make sense to compare our evaluation results with whatever figures those systems report, and also with the Turkic Apertium pairs crh-tur, kaz-tur, kaz-tat and gag-tur.

  • Adalı et al. seem to have done a fair amount of work on Uyghur analysis and disambiguation, so their results would also be worth comparing against.

WER
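
A minimal sketch of how a WER figure for the uig-tur output could be computed, assuming whitespace tokenisation and a single reference translation per sentence. The function names and the toy sentences are made up for illustration and are not part of any Apertium tooling; numbers reported by other systems may rely on different tokenisation or normalisation, so they may not be directly comparable.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    edits = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        edits += edit_distance(ref_tokens, hyp.split())
        words += len(ref_tokens)
    if words == 0:
        raise ValueError("reference side is empty")
    return edits / words


if __name__ == "__main__":
    # Placeholder tokens standing in for a post-edited reference and MT output.
    print(wer(["a b c d"], ["a x c"]))
```

Summing edits and reference lengths over the whole corpus, rather than averaging per-sentence rates, keeps short sentences from dominating the score.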

BLEU

Various Potential Corpora

Examples