Difference between revisions of "Talk:Turkish and Azerbaijani"

{{TOCD}}
I think we will need new definitions for the five mentioned cases of Turkish nouns:

*tocase or tcase for short
*fromcase or fcase for short
*incase or icase for short
*thatcase or thcase for short
*purecase or pcase for short





Latest revision as of 22:53, 30 March 2011

Comparative Resources

A Comparison of Modern Azeri With Modern Turkish, by Kurtulush Oztopchu (Berkeley/UCLA):

http://azer.com/aiweb/categories/magazine/13_folder/13_articles/kurtulush_azeri_turkish_13.pdf

Dictionary of the Turkic Languages:

http://books.google.com/books?id=2f3yxBxf1TYC&pg=PA1&dq=turkic+dictionary&ei=u6bGRobjCI_g6wKSs-XfDw&sig=KmHmuF4VK5oc30rFxkBOHyuY0eg

http://tdk.org.tr/lehceler/

Example for Noun cases:

http://www.ingilish.com/turkishnouncase.htm

The possession phrase part of this page also has some extra information on genitive and comitative forms: http://www.ingilish.com/turkishpasttense.htm

Comitative case: the comitative case should also be modelled, because it is also inflected on the noun.--Msalperen 10:16, 18 August 2007 (BST)

http://www.turkishlanguage.co.uk/fiilkipi.htm

http://www.practicalturkish.com/turkish-verb--literature.html

Azeri Vowel and Consonant Harmony: http://home.unilang.org/wiki3/index.php/Azeri_vowel_and_consonant_harmony

Harmonization Issue

We can define <noun-2> for nouns whose last vowel is of one kind (let's say e) and define new kinds of affixes (<sg2-2> for example) that only follow this second kind of noun. Would that be a solution for harmonization? There are not many types of vowels when it comes to harmonization. To avoid confusing noun2 with noun-2, we can use a dash (-) as already shown. Let's say bira is a type 1 noun and lar is the first form of the plural marker. It is obvious to us that pl-2 will never follow noun-1; still, we can define noun-1 in the dictionary like so:

  • biralarım = <noun-1><pl-1><sg1>

and ev is a noun of the second type:

  • evlerim = <noun-2><pl-2><sg-2>

There are 8 vowels and only two plural markers in Turkish, so the split would be as follows.

For a, ı, o and u (the back, so-called "thick" vowels):

  • stems: bira, koro, kız, muz
  • biralarım = <noun-1><pl-1><p1> (my beers)
  • korolarım = <noun-1><pl-1><p1> (my choirs)
  • kızlarım = <noun-1><pl-1><p1> (my daughters)
  • muzlarım = <noun-1><pl-1><p1> (my bananas)

For e, i, ö and ü (the front, so-called "thin" vowels):

  • stems: ev, kedi, göz, düş
  • evlerim = <noun-2><pl-2><p-1> (my houses)
  • kedilerim = <noun-2><pl-2><p-1> (my cats)
  • gözlerim = <noun-2><pl-2><p-1> (my eyes)
  • düşlerim = <noun-2><pl-2><p-1> (my dreams)

If the last vowel of the stem is a, the noun will always be type 1 and the affixes that follow will always be type 1; the same logic applies to the second type of nouns. In the XML file we will simply define the word as noun-1 or noun-2 instead of using the single identifier "<n>". I think this solution will require only a few more definitions or paradigms. --Msalperen 21:17, 18 August 2007 (BST)
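The classification described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Apertium dictionary code: the function names `noun_type` and `my_plural` and the two lookup tables are made up for this example, input is assumed to be lowercase, and exceptions to harmonization (loanwords that take the "wrong" suffix vowel) are ignored.

```python
# Back vowels select type 1 affixes, front vowels select type 2 affixes.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")

# Hypothetical affix tables, indexed by noun type.
PLURAL = {1: "lar", 2: "ler"}
POSS_1SG_AFTER_PLURAL = {1: "ım", 2: "im"}


def noun_type(stem):
    """Return 1 or 2 depending on the last vowel of a lowercase stem."""
    for ch in reversed(stem):
        if ch in BACK_VOWELS:
            return 1
        if ch in FRONT_VOWELS:
            return 2
    raise ValueError(f"no vowel found in stem {stem!r}")


def my_plural(stem):
    """Build the 'my <noun>s' form: stem + plural + 1sg possessive."""
    t = noun_type(stem)
    return stem + PLURAL[t] + POSS_1SG_AFTER_PLURAL[t]


for stem in ["bira", "koro", "kız", "muz", "ev", "kedi", "göz", "düş"]:
    print(stem, noun_type(stem), my_plural(stem))
```

Because the plural marker always ends in a or e, the possessive that follows it only needs two variants here; a suffix attached directly to the stem would need the full four-way vowel harmony, which this sketch does not cover.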

Okay... This won't suffice if <noun-1> corresponds to <noun-2> in the second language? Confused.--Msalperen 22:18, 18 August 2007 (BST)


Clitics

  • YUM (Yumuşama)_NK: if the last two consonants at the end of the stem are nk, turn the final k into g.

Example: Renk (stem) + i = Rengi (not Renki). Named entities do not undergo this change.

  • YUM (Yumuşama): if the final consonant of the stem is p, ç, t or k and the affix starts with a vowel, change it to b, c, d or ğ respectively.

Examples: Kitap + ı = Kitabı, Güç + ü (harmonized i) = Gücü

  • DUS: while affixing, drop the last vowel of the stem and join the rest.

Examples: nutuk + a = nutka (not nutuka), ufuk + a = ufka (not ufuka), alın + ı = alnı (not alını)

  • TERS: reverse the harmonization: if the last vowel of the word is a, start the affix with e (or the reverse).

Examples: saat + a becomes saate (harmonization would predict "saata", but the vowel is reversed); işgal + a = işgale (harmonization would predict "işgala"). This irregularity occurs especially with words of Arabic or Persian origin.
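As a rough sketch, the four indicators above could be applied as flags when a suffix is attached to a stem. Everything in this example is an assumption made for illustration: the function `attach`, the flag spellings `YUM_NK`, `YUM`, `DUS` and `TERS`, and the convention that the caller passes an already harmonized, lowercase suffix. It is not how any existing Apertium dictionary encodes these rules.

```python
VOWELS = set("aıoueiöü")
# Final consonant softening before a vowel-initial suffix (YUM).
SOFTEN = {"p": "b", "ç": "c", "t": "d", "k": "ğ"}
# Swap each back vowel with its front counterpart and vice versa (TERS).
FLIP = str.maketrans("aeıiuüoö", "eaiıüuöo")


def attach(stem, suffix, flags=frozenset()):
    """Attach a harmonized suffix to a lowercase stem, honouring the flags."""
    if "TERS" in flags:
        # Reverse the harmonization of the suffix vowels.
        suffix = suffix.translate(FLIP)
    if suffix and suffix[0] in VOWELS:
        if "DUS" in flags:
            # Drop the last vowel of the stem and join the rest.
            idx = max(i for i, ch in enumerate(stem) if ch in VOWELS)
            stem = stem[:idx] + stem[idx + 1:]
        if "YUM_NK" in flags and stem.endswith("nk"):
            stem = stem[:-1] + "g"
        elif "YUM" in flags and stem[-1] in SOFTEN:
            stem = stem[:-1] + SOFTEN[stem[-1]]
    return stem + suffix


print(attach("renk", "i", {"YUM_NK"}))
print(attach("kitap", "ı", {"YUM"}))
print(attach("nutuk", "a", {"DUS"}))
print(attach("saat", "a", {"TERS"}))
```

Named entities would simply be stored without the `YUM_NK` flag, so the stem stays unchanged when a suffix is attached.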

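More About Clitics:

  • http://www.informaworld.com/smpp/content~content=a740968487~db=all~jumptype=rss
  • 3nokta.wordpress.com/2007/03/19/ses-olaylari/ (currently unavailable)
  • http://www.egze.com/download/uploads/ses_olaylari.zip (virus tested, clear)
  • Efficient Find-and-Replace in Agglutinative Languages: The Case of Turkish, www.hlst.sabanciuniv.edu/archive/patras.doc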

Other Indicators

  • YAL: the word (probably an adjective or conjunction) is only used in its stem form and does not take any kind of affix.
  • GEN: exception of the present tense (will be explained later).