Difference between revisions of "Turkic languages"

Revision as of 13:18, 14 December 2011

Status

A transducer is considered "working" once it reaches roughly 80% coverage on a range of corpora, and "production" once coverage exceeds 90%.
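The thresholds above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: coverage is taken to be the share of corpus tokens that receive at least one analysis, the approximate "~80%" is treated as a hard cutoff, "on a range of corpora" is read as "the worst corpus decides", and anything below the lower threshold is labelled "development" as in the table below. The names `naive_coverage`, `state_for` and `is_known` are hypothetical and not part of any Apertium tool.

```python
from typing import Callable, Iterable, Sequence


def naive_coverage(tokens: Iterable[str], is_known: Callable[[str], bool]) -> float:
    """Share of tokens for which the analyser returns at least one analysis.

    `is_known` is a hypothetical stand-in for a call into a real transducer.
    """
    tokens = list(tokens)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if is_known(token))
    return known / len(tokens)


def state_for(coverages: Sequence[float]) -> str:
    """Map per-corpus coverage values (0.0 to 1.0) to a state label.

    Assumption: the lowest coverage across the corpora decides the state.
    """
    if not coverages:
        raise ValueError("at least one corpus coverage value is required")
    worst = min(coverages)
    if worst > 0.90:
        return "production"
    if worst >= 0.80:
        return "working"
    return "development"


if __name__ == "__main__":
    # Toy corpus: three of the four tokens are recognised.
    cov = naive_coverage(["ev", "evler", "xyz", "ev"], lambda t: t != "xyz")
    print(cov, state_for([cov]))
    # Two corpora for the same transducer; the weaker one decides.
    print(state_for([0.832, 0.65]))
```

Under this reading, a transducer measured at 83.2% on one corpus and about 65% on another stays in "development", which matches how the Kazakh rows are labelled in the table.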

Transducers

name	Language	ISO 639-1	ISO 639-3	formalism	state	stems	corpus	words	%cov	location	primary authors
trmorph	Turkish	`tr`	`tur`	SFST	working	42,827	SETimes	4.1M	~88%		Çağri
kymorph	Kyrgyz	`ky`	`kir`	HFST (lexc+twol)	working	8,555	azattyk 2010	3.4M	~87%	trunk/apertium-tr-ky	Jonathan, Mirlan, Fran
turmorph	Turkish	`tr`	`tur`	HFST (lexc+twol)	development	18,227	SETimes	4.1M	~76%		Gianluca
kazmorph	Kazakh	`kk`	`kaz`	HFST (lexc+twol)	development	9,306	Әуезов	147.5K	83.2%	incubator/apertium-ky-kk	Nathan, Jonathan, Fran
kazmorph	Kazakh	`kk`	`kaz`	HFST (lexc+twol)	development	9,306	wp 2011-11	0.84M	~65%	incubator/apertium-ky-kk	Nathan, Jonathan, Fran
	Chuvash	`cv`	`chv`	HFST (lexc+twol)	development	88		88.8K	~30%	incubator/apertium-cv-ru	Hèctor
	Tatar	`tt`	`tat`						-
azmorph	Azerbaijani	`az`	`aze`	SFST	working?				-	trunk/apertium-tr-az	Gianluca

Turkic-Turkic pairs

Italic text denotes language pairs under development or in the incubator. Regular text denotes a functioning language pair in trunk, and bold text denotes a stable, well-working language pair.

	tr	az	tk	uz	ky	kk	tt	cv	ba	ug
tr	—	tr-az			tr-ky			tr-cv
az	az-tr	—
tk			—
uz				—
ky	ky-tr				—		ky-kk
kk						—	kk-tt
tt						tt-kk	—		tt-ba
cv	cv-tr							—
ba									—
ug										—

Pairs with non-Turkic languages

	tr	ky	kk	cv
en	tr-en	ky-en
fr
es
it
ru				cv-ru
mn			mn-kk

Tagset

A rough guide to the tagsets used in the various Turkic language transducers, with the aim of tagging equivalent phenomena identically. In the following table, ^A stands for Apertium and ^T stands for TRmorph.

Phenomenon	Morphology	Description	Tag(s)	Language(s)
Case
Nominative case			`<nom>`
Genitive case			`<gen>`
Dative case			`<dat>`
Locative case			`<loc>`
Ablative case	-DAn	Case indicating movement away	`<abl>`	Pan-Turkic
Instrumental case			`<ins>`
Privative case			`<priv>`
Terminative case			`<term>`
Final case			`<fin>`
Possession
1st pers sg			`<px1sg>`
1st pers pl			`<px1pl>`
2nd pers sg			`<px2sg>`
2nd pers pl			`<px2pl>`
3rd pers			`<px3sp>`
Number
plural			`<pl>`
Tense, aspect, mood
Imperative	-ø	Mood for giving orders	`<imp>`^A, `<t_imp>`^T	Pan-Turkic
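
To show how the tags in the table are consumed in practice, here is a minimal Python sketch that splits one lexical unit in Apertium stream format (`^surface/lemma<tag>...$`) into its lemma and tags, and renames TRmorph-style tags to their Apertium equivalents. The example analysis of Turkish "evlerden" and the `<n>` and `<v>` part-of-speech tags are assumptions for illustration, not output from any of the transducers listed above. The helper names are hypothetical, and the mapping contains only the one correspondence given in the table (`<t_imp>` to `<imp>`).

```python
import re

# One lexical unit: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")
TAG = re.compile(r"<([^<>]+)>")

# Only the correspondence listed in the tagset table; extend as needed.
TRMORPH_TO_APERTIUM = {"t_imp": "imp"}


def parse_unit(unit: str):
    """Return (surface, [(lemma, [tags...]), ...]) for one lexical unit."""
    match = LEXICAL_UNIT.fullmatch(unit)
    if match is None:
        raise ValueError(f"not a lexical unit: {unit!r}")
    surface, rest = match.groups()
    readings = []
    for analysis in rest.split("/"):
        lemma = analysis.split("<", 1)[0]  # text before the first tag
        readings.append((lemma, TAG.findall(analysis)))
    return surface, readings


def normalise(tags):
    """Rename TRmorph-style tags to Apertium-style tags where a mapping is known."""
    return [TRMORPH_TO_APERTIUM.get(tag, tag) for tag in tags]


if __name__ == "__main__":
    # Assumed example analysis: ev + plural + ablative ("from the houses").
    print(parse_unit("^evlerden/ev<n><pl><abl>$"))
    print(normalise(["v", "t_imp"]))
```

Keeping the mapping in a plain dictionary means each new tag correspondence agreed between transducers is a one-line addition.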

@@ Line 185: / Line 185: @@
 |-
 |colspan=5 align="center"|'''Case'''
+|-
+| Nominative case || ||  || {{tag|nom}} ||  ||
+|-
+| Genitive case || ||  || {{tag|gen}} ||  ||
+|-
+| Dative case || ||  || {{tag|dat}} ||  ||
+|-
+| Locative case || ||  || {{tag|loc}} ||  ||
 |-
 | Ablative case || -DAn || Case indicating movement away  || {{tag|abl}} || Pan-Turkic ||
+|-
+| Instrumental case || ||  || {{tag|ins}} ||  ||
+|-
+| Privative case || ||  || {{tag|priv}} ||  ||
+|-
+| Terminative case || ||  || {{tag|term}} ||  ||
+|-
+| Final case || ||  || {{tag|fin}} ||  ||
+|-
+|colspan=5 align="center"|'''Possession'''
+|-
+| 1st pers sg || ||  || {{tag|px1sg}} ||  ||
+|-
+| 1st pers pl || ||  || {{tag|px1pl}} ||  ||
+|-
+| 2nd pers sg || ||  || {{tag|px2sg}} ||  ||
+|-
+| 2nd pers pl || ||  || {{tag|px2pl}} ||  ||
+|-
+| 3rd pers || ||  || {{tag|px3sp}} ||  ||
+|-
+|colspan=5 align="center"|'''Number'''
+|-
+| plural || ||  || {{tag|pl}} ||  ||
 |-
 |colspan=5 align="center"|'''Tense, aspect, mood'''
