Difference between revisions of "Turkish and Azerbaijani"
(→Vowel harmony: remove) |
|||
(53 intermediate revisions by 6 users not shown) | |||
{{TOCD}} |
{{TOCD}} |
||
==Noun morphology== |
|||
==Source== |
|||
Turkish has several cases: |
|||
absolute, definite-accusative, dative, locative, ablative, genitive |
|||
It also has pronominal clitics. |
|||
Typically these are applied in the following order: |
|||
# plural suffix |
|||
# suffix of possession |
|||
# case-ending |
|||
# personal suffix |
|||
<pre> |
<pre> |
||
https://github.com/apertium/apertium-tur-aze |
|||
kitap for ex. is the stem |
|||
https://github.com/apertium/apertium-aze |
|||
kitap + plural + pronoun |
|||
https://github.com/coltekin/TRmorph |
|||
kitaplar is the "books" |
|||
a noun has five cases |
|||
object direction is the "i case" |
|||
give me that book for ex. |
|||
bana o kitabı ver |
|||
"that book" |
|||
o kitabı |
|||
that is directed to object |
|||
from that book = o kitaptan |
|||
in that book = o kitapta |
|||
"from my book" |
|||
kitab+ım+dan |
|||
"from my books" |
|||
kitap+lar+ım+dan |
|||
</pre> |
</pre> |
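The suffix order listed above (plural, then possessive, then case) and the kitap examples can be sketched in Python. This is only an illustration, not code from apertium-tur-aze or TRmorph: the archiphoneme letters <code>A</code>, <code>I</code>, <code>D</code>, the <code>inflect</code> function and its parameters are hypothetical names, only consonant-final stems are handled, and stem-final softening (p → b) is assumed to apply, which is lexical in real Turkish.

```python
BACK = set("aıou")
FRONT = set("eiöü")
VOWELS = BACK | FRONT
ROUNDED = set("ouöü")
VOICELESS = set("çfhkpsşt")
SOFTEN = {"p": "b", "ç": "c", "t": "d", "k": "ğ"}  # assumed to apply to this stem

# Hypothetical archiphonemes: A = a/e, I = ı/i/u/ü, D = d/t
CASES = {"dac": "I", "dat": "A", "loc": "DA", "abl": "DAn"}


def last_vowel(word):
    """Return the last vowel of word; it drives vowel harmony."""
    for ch in reversed(word):
        if ch in VOWELS:
            return ch
    raise ValueError(f"no vowel in {word!r}")


def harmonize(word, suffix):
    """Append suffix to word, resolving A, I and D against what precedes them."""
    out = word
    for ch in suffix:
        v = last_vowel(out)
        if ch == "A":
            out += "a" if v in BACK else "e"
        elif ch == "I":
            if v in BACK:
                out += "u" if v in ROUNDED else "ı"
            else:
                out += "ü" if v in ROUNDED else "i"
        elif ch == "D":
            out += "t" if out[-1] in VOICELESS else "d"
        else:
            out += ch
    return out


def inflect(stem, plural=False, p1sg=False, case=None):
    """Apply suffixes in the order plural, possessive (1sg only), case."""
    suffixes = []
    if plural:
        suffixes.append("lAr")
    if p1sg:
        suffixes.append("Im")
    if case is not None:
        suffixes.append(CASES[case])
    word = stem
    for suffix in suffixes:
        vowel_initial = suffix[0] in "AI"
        if vowel_initial and word[-1] in VOWELS:
            raise ValueError("vowel-final stems are outside this sketch")
        # Soften the stem-final consonant only when a vowel follows it directly.
        if vowel_initial and word == stem and word[-1] in SOFTEN:
            word = word[:-1] + SOFTEN[word[-1]]
        word = harmonize(word, suffix)
    return word


print(inflect("kitap", plural=True, p1sg=True, case="abl"))
```

The call at the end corresponds to the "from my books" example; the other forms in the block above (kitabı, kitapta, kitaptan) can be produced the same way by changing the arguments.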
||
==Todo== |
|||
==Agglutination case== |
|||
===Trmorph=== |
|||
verb= gitmek stem=git |
|||
# Check words in corpus which are analysed, but then have an apostrophe and unknown word after (e.g. case ending) ... perhaps they need to be added as proper nouns. e.g. Okyanusu'nun |
|||
<pre> |
|||
I'm going = gidiyorum (tr) |
|||
= gedirəm (Azerbaijani) |
|||
===Azmorph=== |
|||
gid+iyor+um (present continuous, pr1, Turkish) |
|||
# Add consonant harmony |--| Azerbaijani, besides having double vowel harmony like Turkish, has consonant harmony: q/k (as well as their lenited counterparts ğ/y) change according to the preceding vowel. |
|||
ged+ir+əm (present continuous, pr1, Azerbaijani) |
|||
# Add voicing of final t to d, e.g. getmək → geDirəm |
|||
#Remove 040-exception_ben.fst |--| Unlike Turkish, Azerbaijani doesn't have an irregular dative for mən and sən (personal pronouns). |
|||
#Disambiguation sucks big time: I don't know why it doesn't take içerler as <v><t_aor><3pp> but as a noun. Need to fix ASAP |
|||
# Fix punctuation |
|||
# Subcategorise proper names as far as possible |
|||
# Interrogative mi |
|||
# Add da/de cnjcoo as d<A>, since it follows vowel harmony |
|||
# remove apostrophe from case endings after propernames etc. |
|||
===Other=== |
|||
git (lemma) -i -yor (for continuous tense) -um (for first person) (Turkish) |
|||
get (lemma) -i -r (for continuous tense) -əm (for first person) (Azerbaijani) |
|||
</pre> |
|||
* add proper support for compound numerals in the bidix |
|||
==Test case== |
|||
* write up test cases on the wiki |
|||
* find a girlfriend for the boss |
|||
* the corpus is [http://elx.dlsi.ua.es/~fran/SETIMES/source/tr-en/setimes.tr here], you'll need to clean it. |
|||
* <s>make a script to trim the trmorph lexicon to the bidix... we don't want to analyse anything more than we can translate ! </s> |
|||
** '''Improve the trimming script(s)''' |
|||
* FIX: yemek <v> does weird thing with the vowel |
|||
* '''write disambiguation rules''' |
|||
* '''Subcategorise proper names as far as possible''': One way of doing this would be with Wikipedia -- at least for toponyms. |
|||
===Overgeneration=== |
|||
*Turkish: biram var. |
|||
*Azerbaijani: pivəm var |
|||
beer+p1 have |
|||
* <code>^<q><3s><dir>$</code> → midir/mudur/müdür/mıdır |
|||
I have a beer. |
|||
* <code>^Ye<v><pass><t_cont><3s>$</code> → Yeniliyor/Yeniyor |
|||
* <code>^ye<v><t_imp><2s>$</code> → ye/yesənə |
|||
* <code>^ölç<v><pass><abil><neg><t_aor><3s>$</code> → ölçüləbilmir/ölçüləmir |
|||
* <code> ^bir<num><D_sAr><adv>$</code> → bir'ər/birər |
|||
==See also== |
|||
*Turkish: iki biram var |
|||
*Azerbaijani: iki pivəm var |
|||
* [[Turkish]] |
|||
two beer+p1 have |
|||
* [[/Pending tests|Pending tests]] |
|||
* [[/Regression tests|Regression tests]] |
|||
==External links== |
|||
I have two beers |
|||
* [http://www.tdk.org.tr/lehceler/Default.aspx Pan-Turkic dictionary] |
|||
===Noun=== |
|||
==Further reading== |
|||
* <code>abs</code> — absolute |
|||
* <code>dac</code> — definite-accusative |
|||
* <code>dat</code> — dative |
|||
* <code>abl</code> — ablative |
|||
* <code>loc</code> — locative |
|||
* <code>gen</code> — genitive |
|||
* <code>com</code> — comitative |
|||
* Vügar Sultanzade (????) ''Turkish - Azerbaijani Dictionary of Interlingual Homonyms and Paronyms'' |
|||
Underlined denotes the affix. |
|||
====Turkish==== |
|||
;Bira |
|||
{|class=wikitable |
|||
! person !! n.sg.abs !! n.sg.dac !! n.sg.dat !! n.sg.loc !! n.sg.abl !! n.sg.gen !! n.sg.com |
|||
|- |
|||
|''none''|| bira || bira<u>'''y'''ı</u> || bira<u>'''y'''a</u> || bira<u>da</u> || bira<u>dan</u> || bira<u>'''n'''ın</u> || bira<u>yla</u> |
|||
|- |
|||
| p1.sg || bira<u>m</u> || bira<u>mı</u> || bira<u>ma</u> || bira<u>mda</u>||bira<u>mdan</u> || bira<u>mın</u> || bira<u>mla</u> |
|||
|- |
|||
| p2.sg || bira<u>n</u> || bira<u>'''n'''ı</u> || bira<u>'''n'''a</u> || bira<u>nda</u> || bira<u>ndan</u> || bira<u>'''n'''ın</u> || bira<u>nla</u> |
|||
|- |
|||
| p3.sg || bira<u>sı</u> || bira<u>'''s'''ını</u> || bira<u>'''s'''ına</u> || bira<u>sında</u> || bira<u>sından</u> || bira<u>'''s'''ının</u> || bira<u>sıyla</u> |
|||
|- |
|||
| p1.pl || bira<u>mız</u> || bira<u>mızı</u> || bira<u>mıza</u> || bira<u>mızda</u> || bira<u>mızdan</u> || bira<u>mızın</u> || bira<u>mızla</u> |
|||
|- |
|||
| p2.pl || bira<u>nız</u> || bira<u>nızı</u> || bira<u>nıza</u> || bira<u>nızda</u> || bira<u>nızdan</u> || bira<u>nızın</u> || bira<u>nızla</u> |
|||
|- |
|||
| p3.pl || bira<u>sı</u> || bira<u>sını</u> || bira<u>sına</u> || bira<u>sında</u> || bira<u>sından</u> || bira<u>sının</u> || bira<u>sıyla</u> |
|||
|- |
|||
| || || || || || || <!-- nothing here --> |
|||
|- |
|||
! person !! n.pl.abs !! n.pl.dac !! n.pl.dat !! n.pl.loc !! n.pl.abl !! n.pl.gen !! n.pl.com |
|||
|- |
|||
|''none''|| bira<u>lar</u> || bira<u>ları</u> || bira<u>lara</u> || bira<u>larda</u> || bira<u>lardan</u> || bira<u>ların</u> || bira<u>larla</u> |
|||
|- |
|||
| p1.sg || bira<u>larım</u> || bira<u>larımı</u> || bira<u>larıma</u> || bira<u>larımda</u> || bira<u>larımdan</u> || bira<u>larımın</u> || bira<u>larımla</u> |
|||
|- |
|||
| p2.sg || bira<u>ların</u> || bira<u>larını</u> || bira<u>larına</u> || bira<u>larında</u> || bira<u>larından</u> || bira<u>larının</u> || bira<u>larınla</u> |
|||
|- |
|||
| p3.sg || bira<u>ları</u> || bira<u>larını</u> || bira<u>larına</u> || bira<u>larında</u> || bira<u>larından</u> || bira<u>larının</u> || bira<u>larıyla</u> |
|||
|- |
|||
| p1.pl || bira<u>larımız</u> || bira<u>larımızı</u> || bira<u>larımıza</u> || bira<u>larımızda</u> || bira<u>larımızdan</u> || bira<u>larımızın</u> || bira<u>larımızla</u> |
|||
|- |
|||
| p2.pl || bira<u>larınız</u> || bira<u>larınızı</u> || bira<u>larınıza</u> || bira<u>larınızda</u> || bira<u>larınızdan</u> || bira<u>larınızın</u> || bira<u>larınızla</u> |
|||
|- |
|||
| p3.pl || bira<u>ları</u> || bira<u>larını</u> || bira<u>larına</u> || bira<u>larında</u> || bira<u>larından</u> || bira<u>larının</u> || bira<u>larıyla</u> |
|||
|- |
|||
| || || || || || || <!-- nothing here --> |
|||
|} |
|||
The consonants in bold are buffer consonants: they only link the vowels on either side of them and are not part of the suffix itself. If the stem (the noun in this case) ends in a consonant, these extra letters are dropped. For example, for the word tabut (which ends in the letter t), the n.sg.dac form without a possessive is tabut'''u''' (the u results from vowel harmony). |
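The buffer-consonant behaviour described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the rule set used in the Apertium or TRmorph sources: <code>attach_case</code> and the lookup tables are hypothetical names, only three cases without a possessive are covered, and stem-final consonant softening is ignored.

```python
VOWELS = "aıoueiöü"
BACK = "aıou"

# Four-way harmony: last stem vowel -> high vowel used in the suffix
HIGH = {"a": "ı", "ı": "ı", "o": "u", "u": "u",
        "e": "i", "i": "i", "ö": "ü", "ü": "ü"}

# Buffer consonant inserted only after a vowel-final stem
BUFFER = {"dac": "y", "dat": "y", "gen": "n"}


def last_vowel(word):
    """Return the last vowel of word; it drives vowel harmony."""
    for ch in reversed(word):
        if ch in VOWELS:
            return ch
    raise ValueError(f"no vowel in {word!r}")


def attach_case(stem, case):
    """Attach a dac/dat/gen ending to a bare noun stem (no possessive)."""
    v = last_vowel(stem)
    high = HIGH[v]
    low = "a" if v in BACK else "e"
    endings = {"dac": high, "dat": low, "gen": high + "n"}
    if case not in endings:
        raise ValueError(f"case {case!r} is outside this sketch")
    buffer = BUFFER[case] if stem[-1] in VOWELS else ""
    return stem + buffer + endings[case]


for stem in ("bira", "tabut"):
    print(stem, [attach_case(stem, c) for c in ("dac", "dat", "gen")])
```

For bira the buffer consonant appears in all three forms, while for tabut the endings attach directly to the stem.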
|||
;Kitab |
|||
{|class=wikitable |
|||
! person !! n.sg.abs !! n.sg.dac !! n.sg.dat !! n.sg.loc !! n.sg.abl !! n.sg.gen !! n.sg.com |
|||
|- |
|||
|''none''||kitab||kitabı||kitaba||kitabda||kitabdan ||kitabın ||kitabla |
|||
|- |
|||
| p1.sg ||kitabım||kitabımı||kitabıma||kitabımda||kitabımdan ||kitabımın ||kitabımla |
|||
|- |
|||
| p2.sg ||kitabın||kitabını||kitabına||kitabında||kitabından ||kitabının ||kitabınla |
|||
|- |
|||
| p3.sg ||kitabı||kitabını||kitabına||kitabında||kitabından ||kitabının ||kitabıyla |
|||
|- |
|||
| p1.pl ||kitabımız||kitabımızı||kitabımıza||kitabımızda||kitabımızdan ||kitabımızın ||kitabımızla |
|||
|- |
|||
| p2.pl ||kitabınız||kitabınızı||kitabınıza||kitabınızda||kitabınızdan ||kitabınızın ||kitabınızla |
|||
|- |
|||
| p3.pl ||kitabı||kitabını||kitabına||kitabında||kitabından||kitabının ||kitabıyla |
|||
|- |
|||
|- |
|||
| || || || || || || <!-- nothing here --> |
|||
|- |
|||
! person !! n.pl.abs !! n.pl.dac !! n.pl.dat !! n.pl.loc !! n.pl.abl !! n.pl.gen !! n.pl.com |
|||
|- |
|||
|none||kitablar||kitabları||kitablara||kitablarda||kitablardan||kitabların||kitablarla |
|||
|- |
|||
| p1.sg ||kitablarım||kitablarımı||kitablarıma||kitablarımda||kitablarımdan||kitablarımın||kitablarımla |
|||
|- |
|||
| p2.sg ||kitabların||kitablarını||kitablarına||kitablarında||kitablarından||kitablarının||kitablarınla |
|||
|- |
|||
| p3.sg ||kitabları||kitablarını||kitablarına||kitablarında||kitablarından||kitablarının||kitablarıyla |
|||
|- |
|||
| p1.pl ||kitablarımız||kitablarımızı||kitablarımıza||kitablarımızda||kitablarımızdan||kitablarımızın||kitablarımızla |
|||
|- |
|||
| p2.pl ||kitablarınız||kitablarınızı||kitablarınıza||kitablarınızda||kitablarınızdan||kitablarınızın||kitablarınızla |
|||
|- |
|||
| p3.pl ||kitabları||kitablarını||kitablarına||kitablarında||kitablarından||kitablarının||kitablarıyla |
|||
|} |
|||
====Azerbaijani==== |
|||
{|class=wikitable |
|||
! person !! n.sg.abs !! n.sg.dac !! n.sg.dat !! n.sg.loc !! n.sg.abl !! n.sg.gen !! n.sg.com |
|||
|- |
|||
|''none''|| pivə || pivə<u>ni</u> ||pivə<u>yə</u> ||pivə<u>də</u> ||pivə<u>dən</u> || pivə<u>nin</u> ||pivə<u>ylə</u> |
|||
|- |
|||
| p1.sg || pivə<u>m</u> ||pivə<u>mi</u> ||pivə<u>mə</u> ||pivə<u>mdə</u> ||pivə<u>mdən</u> || pivə<u>min</u> ||pivə<u>mlə</u> |
|||
|- |
|||
| p2.sg || pivə<u>n</u> ||pivə<u>ni</u> ||pivə<u>nə</u> ||pivə<u>ndə</u> ||pivə<u>ndən</u> || pivə<u>nin</u> ||pivə<u>nlə</u> |
|||
|- |
|||
| p3.sg || pivə<u>si</u> ||pivə<u>sini</u> ||pivə<u>sinə</u> ||pivə<u>sində</u> ||pivə<u>sindən</u> || pivə<u>sinin</u> ||pivə<u>silə</u> |
|||
|- |
|||
| p1.pl || pivə<u>miz</u> ||pivə<u>mizi</u> ||pivə<u>mizə</u> ||pivə<u>mizdə</u> ||pivə<u>mizdən</u> || pivə<u>mizin</u> ||pivə<u>mizlə</u> |
|||
|- |
|||
| p2.pl || pivə<u>niz</u> ||pivə<u>nizi</u> ||pivə<u>nizə</u> ||pivə<u>nizdə</u> ||pivə<u>nizdən</u> || pivə<u>nizin</u> ||pivə<u>nizlə</u> |
|||
|- |
|||
| p3.pl || pivə<u>si</u> ||pivə<u>sini</u> ||pivə<u>sinə</u> ||pivə<u>sində</u> ||pivə<u>sindən</u> || pivə<u>sinin</u> ||pivə<u>silə</u> |
|||
|- |
|||
| || || || || || || <!-- nothing here --> |
|||
|- |
|||
! person !! n.pl.abs !! n.pl.dac !! n.pl.dat !! n.pl.loc !! n.pl.abl !! n.pl.gen !! n.pl.com |
|||
|- |
|||
|''none''||pivə<u>lər</u>||pivə<u>ləri</u>||pivə<u>lərə</u>||pivə<u>lərdə</u> ||pivə<u>lərdən</u>||pivə<u>lərin</u>||pivə<u>lərlə</u> |
|||
|- |
|||
| p1.sg ||pivə<u>lərim</u>||pivə<u>lərimi</u>||pivə<u>lərimə</u>||pivə<u>lərimdə</u> ||pivə<u>lərimdən</u>||pivə<u>lərimin</u>||pivə<u>lərimlə</u> |
|||
|- |
|||
| p2.sg ||pivə<u>lərin</u>||pivə<u>lərini</u>||pivə<u>lərinə</u>||pivə<u>lərində</u> ||pivə<u>lərindən</u>||pivə<u>lərinin</u>||pivə<u>lərinlə</u> |
|||
|- |
|||
| p3.sg ||pivə<u>ləri</u>||pivə<u>lərini</u>||pivə<u>lərinə</u>||pivə<u>lərində</u> ||pivə<u>lərindən</u>||pivə<u>lərinin</u>||pivə<u>lərilə</u> |
|||
|- |
|||
| p1.pl ||pivə<u>lərimiz</u>||pivə<u>lərimizi</u>||pivə<u>lərimizə</u>||pivə<u>lərimizdə</u> ||pivə<u>lərimizdən</u>||pivə<u>lərimizin</u>||pivə<u>lərimizlə</u> |
|||
|- |
|||
| p2.pl ||pivə<u>ləriniz</u>||pivə<u>lərinizi</u>||pivə<u>lərinizə</u>||pivə<u>lərinizdə</u> ||pivə<u>lərinizdən</u>||pivə<u>lərinizin</u>||pivə<u>lərinizlə</u> |
|||
|- |
|||
| p3.pl ||pivə<u>ləri</u>||pivə<u>lərini</u>||pivə<u>lərinə</u>||pivə<u>lərində</u> ||pivə<u>lərindən</u>||pivə<u>lərinin</u>||pivə<u>lərilə</u> |
|||
|- |
|||
| || || || || || || <!-- nothing here --> |
|||
|} |
|||
====Comparison==== |
|||
{|class=wikitable |
|||
! Turkish !! Azerbaijani !! Gloss !! Symbols |
|||
|- |
|||
| bira || pivə || beer || <code>n.sg</code> |
|||
|- |
|||
| biralar || pivələr || beers || <code>n.pl</code> |
|||
|- |
|||
| biram || pivəm || my beer || <code>n.sg.p1</code> |
|||
|- |
|||
| biralarım || pivələrim || my beers || <code>n.pl.p1</code> |
|||
|- |
|||
| biradan || pivədən || from the beer || <code>n.sg.abl</code> |
|||
|- |
|||
| biralardan || pivələrdən || from the beers || <code>n.pl.abl</code> |
|||
|- |
|||
| biramdan || pivəmdən || from my beer || <code>n.sg.p1.abl</code> |
|||
|- |
|||
| biralarımdan || pivələrimdən || from my beers || <code>n.pl.p1.abl</code> |
|||
|} |
|||
===Verb=== |
|||
{|class=wikitable |
|||
! Turkish !! Azerbaijani !! Gloss |
|||
|- |
|||
| içerim || içərəm || I drink |
|||
|- |
|||
| içersin || içərsən || You drink |
|||
|- |
|||
| içer || içər || He drinks |
|||
|- |
|||
| içer || içər || She drinks |
|||
|- |
|||
| içer || içər || It drinks |
|||
|- |
|||
| içersiniz || içərsiniz || You (pl.) drink |
|||
|- |
|||
| içeriz || içərik || We drink |
|||
|- |
|||
| içerler || içərlər || They drink |
|||
|} |
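The personal endings in the verb table above can be captured as a small lookup. This is a hedged Python sketch that only restates the table: the language codes <code>tur</code>/<code>aze</code> follow the repository name, while <code>ENDINGS</code>, <code>AORIST_BASE</code> and <code>aorist</code> are hypothetical names, and the endings are hard-coded for this front-vowel stem rather than derived by vowel harmony.

```python
# Aorist base of "to drink" in each language, as in the table above
AORIST_BASE = {"tur": "içer", "aze": "içər"}

# Personal endings read off the table (valid for this front-vowel stem only)
ENDINGS = {
    "tur": {"p1.sg": "im", "p2.sg": "sin", "p3.sg": "",
            "p1.pl": "iz", "p2.pl": "siniz", "p3.pl": "ler"},
    "aze": {"p1.sg": "əm", "p2.sg": "sən", "p3.sg": "",
            "p1.pl": "ik", "p2.pl": "siniz", "p3.pl": "lər"},
}


def aorist(lang, person):
    """Return the aorist form of 'drink' for the given language and person."""
    return AORIST_BASE[lang] + ENDINGS[lang][person]


for person in ENDINGS["tur"]:
    print(person, aorist("tur", person), aorist("aze", person))
```

Laid out this way, the differences show up in the vowel quality of the endings (e/ə) and in the first person, where Turkish has -im and -iz and Azerbaijani has -əm and -ik.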
|||
==Symbols== |
|||
All tenses have four forms: |
|||
*Affirmative stative (<code>aff.stat</code>) |
|||
*Affirmative interrogative (<code>aff.int</code>) |
|||
*Negative stative (<code>neg.stat</code>) |
|||
*Negative interrogative (<code>neg.int</code>) |
|||
For each of 6 persons: |
|||
* First person, singular (<code>p1.sg</code>) |
|||
* Second person, singular (<code>p2.sg</code>) |
|||
* Third person, singular (<code>p3.sg</code>) |
|||
* First person, plural (<code>p1.pl</code>) |
|||
* Second person, plural (<code>p2.pl</code>) |
|||
* Third person, plural (<code>p3.pl</code>) |
|||
And here is the list by mood, then tense: |
|||
*'''Indicative mood''' |
|||
**Present Continuous Simple Tense (<code>pres.cont</code>) |
|||
**Simple (Aorist) Tense (<code>pres</code>) |
|||
**Past Definite |
|||
**Past Progressive, dubitative |
|||
**Indefinite Past (Past Aorist) |
|||
**Past Progressive, narrative (<code>past.cont.nar</code>) |
|||
**Past Perfect, narrative |
|||
**Doubtful Distant Past (<code>ddp</code>) |
|||
**Past in the Future |
|||
**Past Conditional, narrative (<code>past.cond.nar</code>) |
|||
**Past Conditional, dubitative (<code>past.cond.dub</code>) |
|||
**Future Simple |
|||
**Future in the Past |
|||
**Future dubitative |
|||
**Future conditional |
|||
*'''Imperative mood''' (<code>imp</code>) |
|||
**Simple Tense (<code>imp.pres</code>) |
|||
*'''Conditional mood''' (<code>cond</code>) |
|||
**Simple Tense (<code>cond.pres</code>) |
|||
**Present Progressive (<code>cond.cont</code>) |
|||
**Aorist/Present |
|||
**Past Definite |
|||
**Indefinite Past |
|||
*'''Subjunctive mood''' (<code>subj</code>) |
|||
**Simple Tense (<code>subj.pres</code>) |
|||
**Past, narrative (<code>subj.past.nar</code>) |
|||
**Past, reportative (<code>subj.past.rep</code>) |
|||
*'''Necessitative mood tenses''' (<code>nec</code>) |
|||
**Simple Tense (<code>nec.pres</code>) |
|||
**Past, Narrative (<code>nec.past.nar</code>) |
|||
**Past, dubitative (<code>nec.past.dub</code>) |
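Since every tense combines four forms with six persons, each tense has 24 paradigm slots. The following Python sketch enumerates them for one tense symbol. The dotted concatenation of tense, form and person into a single label is an assumption made for illustration, as is the function name <code>paradigm_slots</code>; the page itself does not define a combined tag format.

```python
from itertools import product

# Form and person symbols as listed above
FORMS = ["aff.stat", "aff.int", "neg.stat", "neg.int"]
PERSONS = ["p1.sg", "p2.sg", "p3.sg", "p1.pl", "p2.pl", "p3.pl"]


def paradigm_slots(tense):
    """List every form/person slot for one tense symbol (hypothetical label format)."""
    return [f"{tense}.{form}.{person}" for form, person in product(FORMS, PERSONS)]


slots = paradigm_slots("pres.cont")
print(len(slots))
print(slots[0], slots[-1])
```

A list like this could serve as a checklist when writing up test cases for each tense.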
|||
==Examples== |
|||
;Turkish |
|||
Bütün insanlar hür, haysiyet ve haklar bakımından eşit doğarlar. Akıl ve vicdana sahiptirler ve birbirlerine karşı kardeşlik zihniyeti ile hareket etmelidirler. |
|||
;Azerbaijani |
|||
Bütün insanlar ləyaqət və hüquqlarına görə azad və bərabər doğulurlar. Onların şüurları və vicdanları var və bir-birlərinə münasibətdə qardaşlıq ruhunda davranmalıdırlar. |
|||
;Azerbaijani (''turkified'') |
|||
Bütün insanlar <u>azadlıq</u>, ləyaqət və haqlarına görə bərabər doğulurlar. Onların <u>ağılları</u> və vicdanları var və onlar bir-birlərinə münasibətdə qardaşlıq ruhunda davranmalıdırlar. |
|||
An example showing that little word reordering is needed: |
|||
;Turkish |
|||
Bilişim ve haberleşme teknolojilerinin asıl altyapı göstergelerinden biri olan telefon iletişimi ülke halkının en çok istifade ettiği iletişim aracı olmaya devam etmektedir. |
|||
;Azerbaijani |
|||
İnformasiya və kommunikasiya texnologiyalarının əsas infrastruktur göstəricilərindən biri olan telefon rabitəsi ölkə əhalisinin ən çox istifadə etdiyi rabitə vasitəsi olmaqda davam edir. |
|||
;English |
|||
Telephone communication, which is one of the essential infrastructure indicators of information and communication technologies, continues to be the communication device most used by the people of the country. |
|||
==See also== |
|||
* [[Turkish]] |
|||
[[Category:Turkish to Azerbaijani]] |
Latest revision as of 18:27, 8 March 2018
Source
https://github.com/apertium/apertium-tur-aze https://github.com/apertium/apertium-aze https://github.com/coltekin/TRmorph
Todo
Trmorph
- Check words in corpus which are analysed, but then have an apostrophe and unknown word after (e.g. case ending) ... perhaps they need to be added as proper nouns. e.g. Okyanusu'nun
Azmorph
- Add consonant harmony |--| Azerbaijani, besides having double vowel harmony like Turkish, has consonant harmony: q/k (as well as their lenited counterparts ğ/y) change according to the preceding vowel.
- Add voicing of final t to d, e.g. getmək → geDirəm
- Remove 040-exception_ben.fst |--| Unlike Turkish, Azerbaijani doesn't have an irregular dative for mən and sən (personal pronouns).
- Disambiguation sucks big time: I don't know why it doesn't take içerler as <v><t_aor><3pp> but as a noun. Need to fix ASAP
- Fix punctuation
- Subcategorise proper names as far as possible
- Interrogative mi
- Add da/de cnjcoo as d<A>, since it follows vowel harmony
- remove apostrophe from case endings after propernames etc.
Other
- add proper support for compound numerals in the bidix
- write up test cases on the wiki
- find a girlfriend for the boss
- the corpus is at http://elx.dlsi.ua.es/~fran/SETIMES/source/tr-en/setimes.tr; you'll need to clean it.
- (done) make a script to trim the trmorph lexicon to the bidix... we don't want to analyse anything more than we can translate!
  - Improve the trimming script(s)
- FIX: yemek <v> does weird thing with the vowel
- write disambiguation rules
- Subcategorise proper names as far as possible: One way of doing this would be with Wikipedia -- at least for toponyms.
Overgeneration
- ^<q><3s><dir>$ → midir/mudur/müdür/mıdır
- ^Ye<v><pass><t_cont><3s>$ → Yeniliyor/Yeniyor
- ^ye<v><t_imp><2s>$ → ye/yesənə
- ^ölç<v><pass><abil><neg><t_aor><3s>$ → ölçüləbilmir/ölçüləmir
- ^bir<num><D_sAr><adv>$ → bir'ər/birər
See also
- Turkish
- Pending tests
- Regression tests
External links
- Pan-Turkic dictionary: http://www.tdk.org.tr/lehceler/Default.aspx
Further reading
- Vügar Sultanzade (????) Turkish - Azerbaijani Dictionary of Interlingual Homonyms and Paronyms