Difference between revisions of "Vowel harmony"

From Apertium
 
(2 intermediate revisions by 2 users not shown)
Line 15: Line 15:
 
|}
 
|}
   
  +
If the stem had ended with a vowel other than ə, the underscored vowel would differ as well (e.g. a and a instead of ə and ə).
This will pose a problem for both analysis and generation of word forms. In analysis it is possible to ''overanalyse'' words: for example, a paradigm with "a → e" for the plural ending ''-ler'' would accept both ''-ler'' and ''-lar''. We would then analyse both the correct form ''biralar'' and the incorrect form ''biraler''. This causes problems because of ambiguity (we shouldn't be analysing non-existent words!), especially with short words. It remains to be seen whether this ambiguity will be too great.
 
   
  +
[[HFST]] has a handy way of treating this phenomenon. Represent the harmonising vowel in the suffix as e.g. {A} (an ''archiphoneme'' that stands for e.g. the set of a or ə), and use a two-level rule to instantiate it to a specific vowel when the suffix is attached to the stem.
One example of ambiguity would be with the word for "book", ''kitab''. The form ''kitabı'' means "his book", but the form ''kitabi'' (or ''kitabî'') means "bookish". This should not be too much of a problem as the two are different parts of speech and should be taken care of in the tagging stage.
 
 
The other problem is generation: we do not currently have a way in Apertium to enforce vowel harmony. It may be possible to use an alternative spell-checker to do this (e.g. <code>hunspell</code> has specialised algorithms for both Azerbaijani and Turkish), or possibly we could use post-gen or write a new post-gen module for this.
 
   
 
==See also==
 
==See also==
Line 27: Line 25:
 
[[Category:Development]]
 
[[Category:Development]]
 
[[Category:Writing dictionaries]]
 
[[Category:Writing dictionaries]]
  +
[[Category:Documentation in English]]

Latest revision as of 07:09, 20 October 2014

Both Turkish and Azerbaijani, along with most other Turkic languages, exhibit vowel harmony. See the following table of inflections for the Azerbaijani word pivə, "beer". Underscore indicates a vowel that has been "harmonised".

Azerbaijani Gloss
pivə beer
pivələr beers
pivələrim my beers
pivədən from beer
pivələrdən from beers

If the stem had ended with a vowel other than ə, the underscored vowel would differ as well (e.g. a and a instead of ə and ə).

HFST has a handy way of treating this phenomenon. Represent the harmonising vowel in the suffix as e.g. {A} (an archiphoneme that stands for e.g. the set of a or ə), and use a two-level rule to instantiate it to a specific vowel when the suffix is attached to the stem.
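
The archiphoneme idea can be illustrated outside HFST with a small Python sketch. This is not HFST or twolc code, only a rough model of what such a rule does: each {A} in a suffix is realised as a or ə depending on the nearest preceding vowel. The function names, the vowel sets and the simplified two-way (front/back) harmony are assumptions made for illustration; the stems pivə and bira and the suffix shapes come from the examples on this page.

```python
import re

# Assumption: a simplified front/back split of the Azerbaijani vowels.
FRONT_VOWELS = set("eəiöü")
BACK_VOWELS = set("aıou")

# Archiphoneme -> (realisation after a back vowel, realisation after a front vowel).
ARCHIPHONEMES = {"{A}": ("a", "ə")}


def last_vowel(word):
    """Return the last vowel in word, or None if it contains no vowel."""
    for ch in reversed(word):
        if ch in FRONT_VOWELS or ch in BACK_VOWELS:
            return ch
    return None


def realise(stem, suffixes):
    """Attach suffix templates to stem, resolving each archiphoneme
    from the nearest vowel to its left (hypothetical helper)."""
    word = stem
    for suffix in suffixes:
        # Split "l{A}r" into ["l", "{A}", "r"], keeping the archiphoneme tokens.
        for piece in re.split(r"(\{[A-Z]\})", suffix):
            if piece in ARCHIPHONEMES:
                back, front = ARCHIPHONEMES[piece]
                # Falls back to the back variant if no vowel precedes.
                word += front if last_vowel(word) in FRONT_VOWELS else back
            else:
                word += piece
    return word


print(realise("pivə", ["l{A}r", "d{A}n"]))  # pivələrdən
print(realise("bira", ["l{A}r"]))           # biralar
```

Because the suffix is stored once as l{A}r rather than as separate -lar and -lər entries, a form that violates harmony is never produced, and an analyser built the same way would not accept one either.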

See also