Difference between revisions of "Apertium-kaz"

Revision as of 19:43, 30 December 2011

Kazmorph is a morphological analyser/generator for Kazakh, currently under development. It is intended to be compatible with transducers for other Turkic languages, so that text can be translated between those languages.

Installation

Kazmorph is currently located in ky-kk.

Dependency tree

  • hfst (svn ≥r1916)
    • foma

Current State

  • Number of stems: 9,306
  • Coverage: 94.1
corpus          words    coverage
Әуезов          155K     83.2
bible           577K     85.5
azattyq 2010    3.2M     85.4
wp 2011-11      0.84M    79.6
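
The page does not say how the coverage figures are computed. A common "naive coverage" measure is the share of tokens that receive at least one analysis. The sketch below assumes the analyser output is in Apertium stream format, where each token appears as ^surface/analysis$ and an unknown token is echoed back with a leading asterisk. The function name, the sample text and the tags in it are hypothetical and are not taken from kazmorph.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def coverage(analysed_text: str) -> float:
    """Return the percentage of tokens that have at least one known analysis."""
    units = LEXICAL_UNIT.findall(analysed_text)
    if not units:
        return 0.0
    known = 0
    for unit in units:
        # The first field is the surface form; the rest are analyses.
        analyses = unit.split("/")[1:]
        # An unknown token carries only a '*'-prefixed copy of itself.
        if any(not a.startswith("*") for a in analyses):
            known += 1
    return 100.0 * known / len(units)


# Hypothetical analyser output: three known tokens and one unknown.
sample = "^бала/бала<n><nom>$ ^үй/үй<n><nom>$ ^xyz/*xyz$ ^су/су<n><nom>$"
print(f"{coverage(sample):.1f}")
```

This counts a token as covered even if none of its analyses is the correct one, so it gives an upper bound on how useful the analyser is for a given corpus.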

To-do

Improve coverage

  • causatives
  • collective numbers
  • fix demonstratives
  • vowel harmony of single-syllable words with у and и
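
A rough illustration of why the last item is hard: in Kazakh the letters у and и do not by themselves show whether a stem takes back-vowel or front-vowel suffixes, so a stem whose only vowel is у or и cannot be classified from its spelling. The sketch below is not how kazmorph handles this. The vowel sets and the function name are assumptions made for illustration only.

```python
# Assumed vowel classes for illustration (not taken from kazmorph).
BACK_VOWELS = set("аоұы")
FRONT_VOWELS = set("әеөүіэ")
AMBIGUOUS_VOWELS = set("уи")  # harmony cannot be read off these letters


def harmony_class(stem: str):
    """Guess the harmony class from the last unambiguous vowel.

    Returns "back", "front", or None when the spelling alone does not decide
    (for example a single-syllable stem whose only vowel is у or и).
    """
    for ch in reversed(stem.lower()):
        if ch in BACK_VOWELS:
            return "back"
        if ch in FRONT_VOWELS:
            return "front"
    return None


for stem in ["бала", "үй", "су", "ми"]:
    print(stem, harmony_class(stem))
```

Stems for which this returns None would need their harmony class recorded in the lexicon rather than derived from spelling.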

Future

  • run tests on morphophonology