Turkic lexicon

Layout

General points:

The lexicon will be kept in a single file with the suffix .lexc.
The file will be laid out in the following order:
1. The multicharacter symbols
2. The Root lexicon, pointing to the stem lexicons
3. The morphotactics (continuation lexica)
4. The stem lexicons
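
As a rough illustration of this ordering, a minimal .lexc skeleton might look like the following. The lexicon name Nouns is hypothetical and the body of N2 is invented for the sketch; only the noun tag and the stem entry are taken from the examples on this page.

```lexc
! 1. Multicharacter symbols
Multichar_Symbols

%<n%>    ! Noun

! 2. Root lexicon, pointing to the stem lexicons
LEXICON Root

Nouns ;    ! hypothetical stem lexicon name

! 3. Morphotactics (continuation lexica)
LEXICON N2

%<n%>: # ;    ! assumed body: add the noun tag, then end the word

! 4. Stem lexicons
LEXICON Nouns

кӗнеке:кӗнек N2 ; ! "llibre, книга"
```

A real file would have many more symbols and lexica in each part; the point here is only the relative order of the four parts.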

Multicharacter symbols

Morphological categories must be enclosed in < and >. In the .lexc file these angle brackets are escaped with %. Tag names may contain the letters a-z and the digits 0-9; in extreme cases they may also include the letters A-Z. They must begin with a letter, not a digit.

Examples:

%<n%> Noun
%<p3%> Third person
%<evid%> Evidential

For information on archiphonemes, see the corresponding page.

The list of symbols should be laid out in the following order:

1. The major parts of speech
2. The morphological categories
3. Archiphonemes
4. Other symbols, e.g. morpheme boundary, ' ', '-' etc.

Every symbol should have a comment. The comments should line up.
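
A symbol list following this order, with aligned comments, could be sketched as below. The three tags come from the examples above; the archiphoneme %{A%} and the notations %> and %- are assumptions made for the sketch, not conventions confirmed on this page.

```lexc
Multichar_Symbols

! Major parts of speech
%<n%>       ! Noun

! Morphological categories
%<p3%>      ! Third person
%<evid%>    ! Evidential

! Archiphonemes
%{A%}       ! Example archiphoneme (hypothetical)

! Other symbols
%>          ! Morpheme boundary (assumed notation)
%-          ! Hyphen (assumed notation)
```

Padding every symbol to the same column keeps the comments lined up even when tag names differ in length.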

Morphotactics

Naming continuation lexica

Continuation lexica will be named in upper case, and may contain letters, numbers and the symbol -.
- Examples: LEXICON N1, LEXICON DET-DEM, LEXICON ADV
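
To show how such names appear in a file, here is a small sketch using the three example names. The tags and the lexicon bodies are hypothetical; every tag used would also need to be declared under Multichar_Symbols.

```lexc
LEXICON N1

%<n%>: # ;             ! assumed body: noun tag, then end of word

LEXICON DET-DEM

%<det%>%<dem%>: # ;    ! assumed tags for a demonstrative determiner

LEXICON ADV

%<adv%>: # ;           ! assumed tag for an adverb
```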

What sorts of distinctions to make

TODO: TV vs. IV, Russian vs. non-Russian in Chuvash

Stem lexicons

TODO: Why stems go in lexicon and not infinitives

Lines in the stem lexicons should follow this pattern:

Left side (lexical form)
Colon :
Right side (surface form)
Space
Continuation lexicon
Space
Semicolon ;
Space
Exclamation mark
Space
Open quote "
Gloss (optional)
Close quote "

Example:

кӗнеке:кӗнек N2 ; ! "llibre, книга"
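
The pattern can be sketched in context as follows. The lexicon names Nouns and Adjectives are hypothetical, and writing an empty pair of quotes when there is no gloss is an assumed reading of "Gloss (optional)"; лайӑх is the A1 adjective from the table under Adjectives.

```lexc
LEXICON Nouns         ! hypothetical lexicon name

! lexical form, colon, surface form, continuation lexicon, ";", then "!" and the quoted gloss
кӗнеке:кӗнек N2 ; ! "llibre, книга"

LEXICON Adjectives    ! hypothetical lexicon name

! no gloss given, so the quotes are left empty (assumed convention)
лайӑх:лайӑх A1 ; ! ""
```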

Morphophonology

TODO: px3 is sIn (and why)

Categorisation

Nominals

Compound Nouns

TODO: N-N compounds with <px3>

Adjectives

A1: adjectives that can be both substantivised and adverbialised
- all three readings (<adj>, <adj.subst> and <adj.advl>)
- have comparison levels
A2: derived/not fully lexicalised adjectives without adverbial reading
- <adj> and <adj.subst> readings
- have comparison levels
A3: derived/not fully lexicalised adjectives without adverbial reading
- so-called "predicatives" (бар, жоқ)
- no comparison levels at all
A4: "pure" adjectives
- no adverbial or substantive readings
- no comparison levels

Type	Language	Example	Reading	Phrase
A1	Chuvash	лайӑх	`<adj>`	Ку лайӑх кĕнеке.
		лайӑхтӑрӑх	`<adj><comp>`	Ку лайӑхтӑрӑхче.
		лайӑх	`<adj><advl>`	Вӑл лайӑх иҫет.
		лайӑхисем	`<adj><subst><pl>`
A2	Chuvash	кӑвак	`<adj>`
		кӑвакрӑх	`<adj><comp>`
		*кӑвак	`<adj><advl>`
		кӑвак	`<adj><subst><pl>`
A3	Chuvash	вилĕ	`<adj>`
		вилĕрӑх, вилĕтĕрĕх	`<adj><comp>`
		*вилĕ	`<adj><advl>`
		вилĕ	`<adj><subst><pl>`
A4	Chuvash	тĕп	`<adj>`
		тĕпрĕх, тĕптĕрĕх	`<adj><comp>`	—
		*тĕп	`<adj><advl>`	—
		*тĕп	`<adj><subst>`	—

Adverbs

Postpositions

TODO: "postpositions" which take poss./case are nouns

Finite verbs

Non-finite verbs

This section outlines what categories of non-finite verb forms exist in Turkic, and how to identify the type of category created by a given affix.

Language specific issues

Turkmen: stem-final voiced and voiceless stops

In Turkmen, there are three types of stem-final stops:

voiced stops
voiceless stops
stops that are voiceless syllable-finally and voiced intervocalically

TODO: finish description of this and explain how it can be / is dealt with

Chuvash: Russian loans ending in -a with non-final stress

