Alternation

Latest revision as of 11:26, 24 March 2012

In some languages, part of the stem changes in certain inflected forms. The change may be completely regular, but at the moment it cannot be modelled cleanly in Apertium monolingual dictionaries (monodixes). Examples include umlaut, consonant gradation and diphthong simplification.

Example

Here is an example of consonant gradation and diphthong simplification in North Sámi. The plural forms, with the exception of the nominative, reduce the diphthong "uo" to "u"; in the table below the comitative singular does so as well. In Apertium dictionaries as they stand, handling this would mean cutting the paradigm at "g", so that everything after the first letter ends up inside the paradigm. This substantially limits the generalisation power of paradigms.

Lemma 	Analysis 	Surface form(s)
guolli 	N+Sg+Nom 	guolli
guolli 	N+Sg+Gen 	guoli 	guole
guolli 	N+Sg+Acc 	guoli
guolli 	N+Sg+Ill 	guollái
guolli 	N+Sg+Loc 	guolis
guolli 	N+Sg+Com 	guliin
guolli 	N+Pl+Nom 	guolit
guolli 	N+Pl+Gen 	guliid
guolli 	N+Pl+Acc 	guliid
guolli 	N+Pl+Ill 	guliide
guolli 	N+Pl+Loc 	guliin
guolli 	N+Pl+Com 	guliiguin
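
To make the pattern in the table easier to see, here is a minimal Python sketch that groups the analyses by stem vowel. The forms are copied from the table; the names `FORMS`, `stem_vowel` and `group_by_vowel` are made up for this illustration and are not part of Apertium.

```python
# Forms copied from the table above: (analysis, surface form).
# The N+Sg+Gen row lists two surface forms, so it appears twice.
FORMS = [
    ("N+Sg+Nom", "guolli"),
    ("N+Sg+Gen", "guoli"),
    ("N+Sg+Gen", "guole"),
    ("N+Sg+Acc", "guoli"),
    ("N+Sg+Ill", "guollái"),
    ("N+Sg+Loc", "guolis"),
    ("N+Sg+Com", "guliin"),
    ("N+Pl+Nom", "guolit"),
    ("N+Pl+Gen", "guliid"),
    ("N+Pl+Acc", "guliid"),
    ("N+Pl+Ill", "guliide"),
    ("N+Pl+Loc", "guliin"),
    ("N+Pl+Com", "guliiguin"),
]


def stem_vowel(form):
    """Return "uo" for the full diphthong and "u" for the reduced one."""
    return "uo" if form.startswith("guo") else "u"


def group_by_vowel(forms):
    """Map each stem vowel to the list of analyses that use it."""
    groups = {"uo": [], "u": []}
    for analysis, form in forms:
        groups[stem_vowel(form)].append(analysis)
    return groups


if __name__ == "__main__":
    for vowel, analyses in group_by_vowel(FORMS).items():
        print(vowel, analyses)
```

The reduced vowel is tied to the tags rather than to the lemma, which is why a paradigm that only describes endings cannot express it without also containing the vowel.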

One way to handle this is overanalysis: accepting both "uo" and "u" as valid stem vowels in every form of the paradigm. For example:

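  <pardefs>
    <!-- Noun endings; only the nominative singular is shown here. -->
    <pardef n="guol/li__n">
      <e>
        <p>
          <l>li</l>
          <r>li<s n="n"/><s n="sg"/><s n="nom"/></r>
        </p>
      </e>
      ...
    </pardef>
    <!-- Stem vowel: surface "u" or "uo", always analysed as "uo". -->
    <pardef n="u_uo">
      <e><p><l>u</l><r>uo</r></p></e>
      <e><p><l>uo</l><r>uo</r></p></e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <!-- g + (u|uo) + l + noun ending -->
    <e lm="guolli">
      <i>g</i><par n="u_uo"/><i>l</i>
      <par n="guol/li__n"/>
    </e>
  </section>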

This overanalyses, because gulli is not a valid form in the language. It would also be possible to split the paradigm and have two entries in the main section, but this would be ugly, as we usually expect the main section to have one entry per lemma. With the dictionary above, both surface forms receive the same analysis:

guolli:guolli<n><sg><nom>
gulli:guolli<n><sg><nom>
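
The overanalysis can be reproduced with a small Python sketch that expands the entry above into every surface/lexical pair it licenses. This is only a toy model of how paradigms combine, not how lttoolbox is implemented; `PARADIGMS`, `ENTRY` and `expand` are names invented for the illustration, and the noun paradigm contains just the one ending shown in the dictionary.

```python
from itertools import product

# Each paradigm is a list of (surface, lexical) pairs, mirroring <l> and <r>.
PARADIGMS = {
    "u_uo": [("u", "uo"), ("uo", "uo")],
    # Only the single entry shown in the dictionary; the rest is elided there.
    "guol/li__n": [("li", "li<n><sg><nom>")],
}

# The main-section entry for "guolli":
# <i>g</i><par n="u_uo"/><i>l</i><par n="guol/li__n"/>
ENTRY = [("i", "g"), ("par", "u_uo"), ("i", "l"), ("par", "guol/li__n")]


def expand(entry, paradigms):
    """Return every (surface, lexical) pair the entry accepts."""
    options = []
    for kind, value in entry:
        if kind == "i":
            # An <i> element contributes the same text to both sides.
            options.append([(value, value)])
        else:
            options.append(paradigms[value])
    pairs = []
    for path in product(*options):
        surface = "".join(s for s, _ in path)
        lexical = "".join(l for _, l in path)
        pairs.append((surface, lexical))
    return pairs


if __name__ == "__main__":
    for surface, lexical in expand(ENTRY, PARADIGMS):
        print(surface + ":" + lexical)
```

Nothing in the entry links the choice made in `u_uo` to the tags produced by the noun paradigm, so every combination is accepted, including the invalid gulli.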

However, overanalysis is ugly. It would be nice to have a way to restrict a stem change based on the tags that follow it, discarding impossible paths. Supporting this would probably require changes both to the dictionary format and to the analyser. We welcome suggestions!
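
As a starting point for discussion, the idea of restricting a change by the following tags can be sketched in Python. Everything here is hypothetical: the condition attached to each alternation entry, the `reduced` rule (read off the table for this one noun) and the function names are assumptions made for the illustration, and none of it exists in the current dictionary format or analyser. The endings are cut from the table after "g" + vowel + "l".

```python
LEMMA = "guolli"

# (surface ending, tags) for a few forms from the table above.
ENDINGS = [
    ("li", ("n", "sg", "nom")),
    ("iin", ("n", "sg", "com")),
    ("it", ("n", "pl", "nom")),
    ("iid", ("n", "pl", "gen")),
]


def reduced(tags):
    """True where the table shows "u": comitative singular, non-nominative plural."""
    return tags[1:] == ("sg", "com") or (tags[1] == "pl" and tags[2] != "nom")


# Hypothetical alternation: a surface vowel plus a condition on the tags
# that follow it. Both vowels are analysed as "uo" in the lemma.
ALTERNATION = [
    ("u", reduced),
    ("uo", lambda tags: not reduced(tags)),
]


def analyses(restrict):
    """Expand all paths; with restrict=True, drop paths whose condition fails."""
    result = []
    for vowel, allowed in ALTERNATION:
        for ending, tags in ENDINGS:
            if restrict and not allowed(tags):
                continue  # discard the impossible path
            surface = "g" + vowel + "l" + ending
            lexical = LEMMA + "".join("<%s>" % t for t in tags)
            result.append((surface, lexical))
    return result


if __name__ == "__main__":
    for surface, lexical in analyses(restrict=True):
        print(surface + ":" + lexical)
```

With `restrict=False` the sketch behaves like the overanalysing dictionary and accepts gulli; with `restrict=True` only the forms attested in the table remain. How such a condition would be written in the dictionary and compiled into the transducer is left open.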

See also

  • SFST and Omorfi (a free Finnish morphological analyser)

External links

  • Wikipedia: Alternation (http://en.wikipedia.org/wiki/Alternation)