Difference between revisions of "Morphology of Tatar language"

 
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
= Phonology =
== Vowel harmony ==
: ''See [[Vowel harmony]] and [http://en.wikipedia.org/wiki/Vowel_harmony Wikipedia]''

; Rounding harmony
Rounding harmony is not represented in writing and is therefore not discussed here.


; Backness harmony
There are two classes of vowels in Tatar — front and back. Backness harmony states that words may not contain both front and back vowels.
There are two classes of vowels in Tatar — front and back. Backness harmony states that words may not contain both front and back vowels. This is a very important rule, because not only stems follow it, every affix does. That means, that every affix in Tatar exists in at least two variants — with front vowel and with a back vowel (other variants may exist because of consonant alternations).


But this is the most general rule. There are some exceptions. They are described below and ideas are given, how we can handle them using twolc rules.
; Rounding harmony
Rounding harmony isn't represented in writing and is therefore not discussed here.


=== Exceptions ===
=== Exceptions. Case history ===


All exceptions are basically of three types:
; Loanwords
# Loanwords. They may not follow the backness harmony rule and both front and back vowels can occur in a given word
* икътисад-ы-на, сәркатиб-е-нә
# Also loanwords, but another problem. There are letters in Tatar alphabet, which correspond to more than one consonant: 'к' and 'г'. They can either stand for velar [k] and [g] or for uvular [q] and [ğ] respectively<ref>In this article instead of phonetic alphabet symbols we use letters from Yaŋalif 2. It has letters for this oppositions and is enough to make it clear</ref>. This doesn't cause many problems in native Tatar words because of another assimilation process, which states that velar variants are used only in words with front vowels and uvular variants only in words with back vowels. Again, this is not true for loan words (Arabic and Farsi). And because there are no special letters for this uvular consonants (they are used in Bashkir e.g.), sometimes 'а', 'ы', 'о' and 'у' are used to denote that the preceding consonant is an uvular one and a soft sign 'ь' at the end of the word denotes that this letters are pronounced as [ә], [е], [ö] and [ü] respectively.
** suffix harmonizes with the last syllable
# Letters 'я' 'ю' 'е' in initial position. They can stand for both - [ya]/[yә], [yu]/[yü] and [yı]/[ye]


=== Exceptions. Possible situations ===
* табигать-кә, җәмгыять-кә, секретар(ь)-енә, табигат(ь)-е
And now let's describe every imaginable situation — from general cases to more specific ones and give some rules, following which we can decide, what variant of an affix transducer has to choose. This description doesn't try to be minimalistic, no doubt that rules can be implemented in the actual twolc file in a more concise way. Actually, the reason of listing out every possible situation (well, at least most of them) is to help to decide the most beautiful way of implementing them in twolc.
** -ь denotes that -а- and -я- are pronounced as [ә] and [jә] respectively (or at least close to them)
** -ь is deleted before suffixes starting with a vowel


# The word has only back vowels or only front vowels: бала-лар-ыбыз-ның, җибәр-гән-нәр-дер
; Letters 'е' 'ю' 'я'
## The simplest case. Twolc rule would be — replace archivowels with their front-vowel realization in words with only front vowels
* ел+ы ''but'' егет+кә, гаеп+ле
# The word has a soft sign 'ь' at the end: табигать, секретарь, гаять
** having 'е' in stem is not enough to decide which variant of suffix (back or front) to choose
## Choose front-vowel variant of affixes no matter what kind of vowels it contain (replace archivowels with their front-vowel variant): секретарь-гә, табигать-кә, шөгыль-ләр-е, гаять тә
## delete 'ь' before vowels: табигат-е-нә, секретар-е
# Loanwords<ref>Maybe a better idea is to surrender and make a sublexicon of Arabic and Persian loanwords (at least of ones containing two-sound letters 'я' 'ю' and 'я') with an additional symbol denoting the backness of joined affixes</ref>: икътисад-ы-на, сәркатиб-е-нә
## suffix harmonizes with the last syllable: replace archivowels with their front-vowel realizations, in words where the last vowel of the stem is front<ref>This won't work for e.g. ''coциализм'' > ''социализм-га''</ref>
# Letter 'я'
## 'я' at the beginning of the word and there are no other vowels after it<ref name="Vowel follows">If there is a vowel our job is already done — it will determine what variant of the suffix to choose</ref> and there is no soft sign after it — 'я' will stand for [ya]: ял-ы
## 'я' at the beginning of a word and there is a soft sign at the end of the word — it will stand for [yә]: яшь+е > яше
## 'я' following a back vowel — stands for [ya]: уян-ырга, (формуляр-ы)
## 'я' following a front vowel — stands for [yә]: сөял-ергә
#The same will be true for letter 'ю':
## 'ю' at the beginning of the word and there are no other vowels after it<ref name="Vowel follows"/> and there is no soft sign after it — 'ю' will stand for [yu]: ю-ды-м
## 'ю' at the beginning of a word and there is a soft sign at the end of the word — it will stand for [yü]: юнь-рәк, юнь+е > юне
## 'ю' following a back vowel — stands for [yu]: аю-га
## 'ю' following a front vowel — stands for [yü]: сөю-дән
#Slightly different "interpretations" has the letter 'е'.
#: The difference is that no soft sign will appear when letter 'е' follows 'a' (which denotes that the preceding 'к/г' are uvular) but stands for [ye] (j+front vowel), compare ''гаять'' [ğәyәt] and ''гает'' [ğәyet]<ref>This may be due the fact that there are diffterent "a"s in Arabic words or whatever-language these words come from</ref><s><ref>Similar problem is with 'у' and 'ы' denoting that preceding 'к/г' are uvular, e.g. ''мәлгунь > мәлгун-е-нә'', ''шигый-ләр-гә''</ref></s>
## 'е' at the beginning of the word and there are no other vowels after it<ref name="Vowel follows"/> — 'е' will stand for [yı]: ел-ы, еш-рак
## 'е' following <s>a back vowel</s> letter 'a'<ref>Note again that the term "back vowel" is used here in a graphemic/as-defined-in-twolc sense — 'а' 'ы' 'у' are '''pronounced''' as front vowels in cases described above</ref> — '''can stand for both [yı] and [ye]''': саек > саeг-ырга, җыен-ырга, туен-ырга ''but'' гаеп > гаеб-е, гаеп-ле-ләр
## 'е' following a front vowel (but not 'ү') — stands for [ye]: сөен-ергә
## 'е' following front vowel 'ү' (and any consonant) — stands for [e]: көтү-е


=== Rules needed for this to work ===
=== Twol rules needed for this ===
* "Soft sign deletion before suffix starting with a vowel"
**ь:0 <=> _ %>: :Vowel ;

=Number=

<pre>
{L}{A}р Plural
</pre>

{| class="wikitable" border="1"
|-
! Rules
! surface forms of -/{L}{A}р/
! Examples
! Gloss
|-
| Ends with nasal consonants (м, н, ң)
| -нар/-нәр
| <s>урам+нар, дошман+нар,таң+нар</s>
| streets,enemies,dawns
|-
| Ends with anything else
| -лар/-ләр
| <s>бала+лар,кыз+лар,китап+лар,юләр+ләр</s>
| children,girls,books,fools
|}

; Rules needed for this to work


=Possessives=
Line 119: Line 114:
|-
| --
| -ы becomes -сы when following vowel
| -ы becomes -сы when following vowel<ref>Or -сы becomes -ы when following consonant, in other words</ref>
| <s>ат+ы</s>,<s>бала+сы</s>
| his horse, his son

Latest revision as of 01:14, 17 March 2012

Vowel harmony

See Vowel harmony and Wikipedia
Rounding harmony

Rounding harmony is not represented in writing and is therefore not discussed here.

Backness harmony

There are two classes of vowels in Tatar: front and back. Backness harmony states that a word may not contain both front and back vowels. This rule is very important because it applies not only to stems but to every affix as well. This means that every affix in Tatar exists in at least two variants, one with a front vowel and one with a back vowel (other variants may exist because of consonant alternations).

This, however, is only the most general rule, and there are some exceptions. They are described below, along with ideas for handling them with twolc rules.
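
The general rule, before any of the exceptions, can be sketched in Python roughly as follows. This is only an illustration, not the twolc implementation: the function names, the `{A}`/`{I}` realization table and the two vowel sets are assumptions made for the example, and the ambiguous letters 'е', 'я' and 'ю' are deliberately left out of both sets.

```python
# Assumed vowel classes for this sketch; 'е', 'я', 'ю' are omitted on purpose
# because their backness cannot be read off the letter alone.
BACK_VOWELS = set("аыоу")
FRONT_VOWELS = set("әөүи")

# Assumed surface realizations of the archivowels used in the suffix notation.
REALIZATIONS = {
    "{A}": {"back": "а", "front": "ә"},
    "{I}": {"back": "ы", "front": "е"},
}


def stem_class(stem):
    """Return 'back' or 'front' for a stem whose vowels all agree."""
    letters = stem.lower()
    has_back = any(ch in BACK_VOWELS for ch in letters)
    has_front = any(ch in FRONT_VOWELS for ch in letters)
    if has_back == has_front:
        # Either mixed vowels or no unambiguous vowel at all:
        # the general rule alone cannot decide.
        raise ValueError(f"general rule does not apply to {stem!r}")
    return "back" if has_back else "front"


def realize(affix, harmony):
    """Replace every archivowel in the affix by its back or front form."""
    for symbol, forms in REALIZATIONS.items():
        affix = affix.replace(symbol, forms[harmony])
    return affix


def attach(stem, affix):
    """Attach an affix to a stem that obeys backness harmony."""
    return stem + realize(affix, stem_class(stem))


print(attach("бала", "л{A}р"))
print(attach("җибәр", "г{A}н"))
```

A stem such as икътисад, which mixes front and back vowels, makes `stem_class` raise an error, which is exactly where the exceptions come in.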

Exceptions. Case history

All exceptions are basically of three types:

  1. Loanwords. They may not follow the backness harmony rule, so both front and back vowels can occur in a given word
  2. Loanwords again, but with a different problem. Two letters of the Tatar alphabet, 'к' and 'г', each correspond to more than one consonant: they can stand either for velar [k] and [g] or for uvular [q] and [ğ] respectively[1]. This causes few problems in native Tatar words because of another assimilation process, under which the velar variants are used only in words with front vowels and the uvular variants only in words with back vowels. Again, this does not hold for loanwords (Arabic and Farsi). Because Tatar has no special letters for these uvular consonants (Bashkir, for example, does), 'а', 'ы', 'о' and 'у' are sometimes used to show that the preceding consonant is uvular, and a soft sign 'ь' at the end of the word shows that these letters are pronounced as [ә], [е], [ö] and [ü] respectively.
  3. The letters 'я', 'ю' and 'е' in initial position. Each can stand for either of two sound sequences: [ya]/[yә], [yu]/[yü] and [yı]/[ye] respectively

Exceptions. Possible situations

Let us now describe every imaginable situation, from general cases to more specific ones, and give rules for deciding which variant of an affix the transducer has to choose. This description does not try to be minimal; the rules can no doubt be implemented more concisely in the actual twolc file. The reason for listing every possible situation (or at least most of them) is to help decide on the most elegant way of implementing them in twolc.

  1. The word has only back vowels or only front vowels: бала-лар-ыбыз-ның, җибәр-гән-нәр-дер
    1. The simplest case. The twolc rule would be: replace archivowels with their front-vowel realization in words with only front vowels, and with their back-vowel realization in words with only back vowels
  2. The word has a soft sign 'ь' at the end: табигать, секретарь, гаять
    1. Choose the front-vowel variant of affixes no matter what kind of vowels the stem contains (replace archivowels with their front-vowel variant): секретарь-гә, табигать-кә, шөгыль-ләр-е, гаять тә
    2. Delete 'ь' before vowels: табигат-е-нә, секретар-е
  3. Loanwords[2]: икътисад-ы-на, сәркатиб-е-нә
    1. The suffix harmonizes with the last syllable: replace archivowels with their front-vowel realizations in words where the last vowel of the stem is front[3]
  4. Letter 'я'
    1. 'я' at the beginning of the word and there are no other vowels after it[4] and there is no soft sign after it — 'я' will stand for [ya]: ял-ы
    2. 'я' at the beginning of a word and there is a soft sign at the end of the word — it will stand for [yә]: яшь+е > яше
    3. 'я' following a back vowel — stands for [ya]: уян-ырга, (формуляр-ы)
    4. 'я' following a front vowel — stands for [yә]: сөял-ергә
  5. The same will be true for letter 'ю':
    1. 'ю' at the beginning of the word and there are no other vowels after it[4] and there is no soft sign after it — 'ю' will stand for [yu]: ю-ды-м
    2. 'ю' at the beginning of a word and there is a soft sign at the end of the word — it will stand for [yü]: юнь-рәк, юнь+е > юне
    3. 'ю' following a back vowel — stands for [yu]: аю-га
    4. 'ю' following a front vowel — stands for [yü]: сөю-дән
  6. The letter 'е' has slightly different "interpretations".
    The difference is that no soft sign appears when the letter 'е' follows 'a' (which denotes that the preceding 'к/г' is uvular) but stands for [ye] (j + front vowel); compare гаять [ğәyәt] and гает [ğәyet][5][6]
    1. 'е' at the beginning of the word and there are no other vowels after it[4] — 'е' will stand for [yı]: ел-ы, еш-рак
    2. 'е' following a back-vowel letter such as 'a'[7] can stand for either [yı] or [ye]: саек > саег-ырга, җыен-ырга, туен-ырга but гаеп > гаеб-е, гаеп-ле-ләр
    3. 'е' following a front vowel (but not 'ү') — stands for [ye]: сөен-ергә
    4. 'е' following front vowel 'ү' (and any consonant) — stands for [e]: көтү-е
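
Situations 1 to 3 above can be approximated by one small decision procedure. The following is a rough Python sketch, not the twolc rules themselves: the function name `affix_harmony` and the two vowel sets are assumptions made for this example, and situations 4 to 6 (the letters 'я', 'ю' and 'е') are not handled, so those letters are simply skipped.

```python
# Assumed vowel classes; the ambiguous letters 'е', 'я', 'ю' are skipped.
BACK_VOWELS = set("аыоу")
FRONT_VOWELS = set("әөүи")


def affix_harmony(stem):
    """Guess which affix variant ('front' or 'back') a stem takes.

    Covers only situations 1-3: harmonic stems, stems ending in a
    soft sign, and loanwords that harmonize with their last syllable.
    """
    stem = stem.lower()
    # Situation 2: a final soft sign always selects front-vowel affixes.
    if stem.endswith("ь"):
        return "front"
    # Situations 1 and 3: the last unambiguous vowel decides.
    for ch in reversed(stem):
        if ch in BACK_VOWELS:
            return "back"
        if ch in FRONT_VOWELS:
            return "front"
    raise ValueError(f"no unambiguous vowel in {stem!r}")


for word in ["бала", "табигать", "икътисад", "сәркатиб"]:
    print(word, affix_harmony(word))
```

As the note on loanwords already warns, the last-vowel heuristic is not reliable: for социализм this sketch answers 'front', although the attested form is социализм-га.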

Twol rules needed for this

  • "Soft sign deletion before suffix starting with a vowel"
    • ь:0 <=> _ %>: :Vowel ;
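
The effect of the soft sign deletion rule at a stem-suffix boundary can be imitated in Python as below. This is a minimal sketch under stated assumptions, not a replacement for the twolc rule: the helper name `join` and the set of vowel letters are made up for the example, and only the stem-final position is considered.

```python
# Assumed set of Tatar Cyrillic vowel letters for this sketch.
VOWEL_LETTERS = set("аәыеиоөуүэюя")


def join(stem, suffix):
    """Concatenate stem and suffix, dropping a stem-final 'ь'
    when the suffix starts with a vowel letter."""
    if stem.endswith("ь") and suffix[:1].lower() in VOWEL_LETTERS:
        stem = stem[:-1]
    return stem + suffix


print(join("табигать", "е"))   # soft sign dropped before a vowel
print(join("табигать", "кә"))  # soft sign kept before a consonant
```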

Possessives

() = deleted after vowel, [] = deleted after consonant, {} = archivowel/consonant

({I})м,                  1sg
({I})ң,                  2sg
[с]{I},                  3sg 
({I})б{I}з,              1pl
({I})г{I}з,              2pl
[с]{I} or {L}{A}р{I}     3pl
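
The bracket notation above can be read mechanically. The Python sketch below expands a suffix template for a given stem; it is only an illustration, and the function name `surface_suffix`, the vowel-letter set and the choice to pass the harmony class in as an argument are all assumptions of this example. Only the archivowel `{I}` is realized here.

```python
import re

# Assumed set of vowel letters and assumed realizations of {I}.
VOWEL_LETTERS = set("аәыеиоөуүэюя")
I_FORMS = {"back": "ы", "front": "е"}


def surface_suffix(template, stem, harmony):
    """Expand a template such as '({I})м' or '[с]{I}' for one stem.

    (...) is deleted after a vowel, [...] is deleted after a consonant,
    and {I} is replaced by its back or front realization.
    """
    after_vowel = stem[-1].lower() in VOWEL_LETTERS
    # Parenthesized material survives only after a consonant.
    template = re.sub(r"\(([^)]*)\)", "" if after_vowel else r"\1", template)
    # Bracketed material survives only after a vowel.
    template = re.sub(r"\[([^\]]*)\]", r"\1" if after_vowel else "", template)
    return template.replace("{I}", I_FORMS[harmony])


for stem in ["ат", "бала"]:
    for template in ["({I})м", "[с]{I}", "({I})б{I}з"]:
        print(stem + surface_suffix(template, stem, "back"))
```

Passing the harmony class explicitly keeps this sketch independent of how backness is decided for difficult stems.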

Note: -лары/-ләре, -нары/-нәре can have several meanings:

  • Әниләре өйдә юк. - Their mother is not at home.
  • Аның малайлары да үзенә охшаган. - His sons are just like him. (In this case {L}{A}р{I} is a combination of the plural suffix {L}{A}р and the 3rd person singular possessive suffix [с]{I})
  • аларның сүзләре - their words

1st Person Possessives

{| class="wikitable" border="1"
|-
! Singular/Plural
! Suffix
! Example
! Gloss
|-
| Singular
| -ым becomes -м if the stem ends with a vowel
| ат+ым, бала+м
| my horse, my son
|-
| Plural
| -ыбыз becomes -быз if the stem ends with a vowel
| ат+ыбыз, бала+быз
| our horse, our son
|}

2nd Person Possessive

{| class="wikitable" border="1"
|-
! Singular/Plural
! Suffix
! Example
! Gloss
|-
| Singular
| -ың becomes -ң if the stem ends with a vowel
| ат+ың, бала+ң
| your horse, your son
|-
| Plural
| -ыгыз becomes -гыз if the stem ends with a vowel
| ат+ыгыз, бала+гыз
| your horse, your son
|}

3rd Person Possessive

{| class="wikitable" border="1"
|-
! Singular/Plural
! Suffix
! Example
! Gloss
|-
| --
| -ы becomes -сы when following a vowel[8]
| ат+ы, бала+сы
| his horse, his son
|}

General Possessive

{| class="wikitable" border="1"
|-
! Singular/Plural
! Suffix
! Example
! Gloss
|-
| --
| -ныкы/-неке
| бала+ныкы, дәүләтнеке
| child's, state's (as their own NPs)
|}

Cases

{| class="wikitable" border="1"
|-
! Case Name
! Suffixes
|-
| absolute
| ---
|-
| genitive
| -ның/-нең
|-
| dative
| -га/-гә, -ка/-кә
|-
| definite-accusative
| -ны/-не
|-
| ablative
| -дан/-дән, -тан/-тән, -нан/-нән
|-
| locative
| -да/-дә, -та/-тә
|}

Notes

  1. In this article we use letters from Yaŋalif 2 instead of phonetic alphabet symbols. It has letters for these oppositions, which is enough to make the distinction clear
  2. Maybe a better idea is to give up and make a sublexicon of Arabic and Persian loanwords (at least of those containing the two-sound letters 'я', 'ю' and 'е') with an additional symbol denoting the backness of the attached affixes
  3. This won't work for e.g. социализм > социализм-га
  4. If there is a vowel, our job is already done: it will determine which variant of the suffix to choose
  5. This may be due to the fact that there are different "a"s in Arabic, or in whatever language these words come from
  6. A similar problem arises with 'у' and 'ы' denoting that the preceding 'к/г' are uvular, e.g. мәлгунь > мәлгун-е-нә, шигый-ләр-гә
  7. Note again that the term "back vowel" is used here in a graphemic, as-defined-in-twolc sense: 'а', 'ы' and 'у' are pronounced as front vowels in the cases described above
  8. In other words, -сы becomes -ы when following a consonant

See also