Difference between revisions of "Apertium-aze"
(Created page with 'Azmorph, a morphological analyzer for Azerbaijani') |
Firespeaker (talk | contribs) (→Current State: bah) |
{{TOCD}}

Azmorph, a morphological analyzer for Azerbaijani

The current version of Azmorph is 0.2.1 (preALPHA).

== What it is, what it does and what it does not ==

Azmorph is a morphological analyzer for Azerbaijani (Azərbaycan dili).

Because Azerbaijani and Turkish are so similar, Azmorph is being developed starting from Çağrı Çöltekin's TRmorph.

Azmorph is at the preALPHA stage of development: this means that it handles only a few features of the language (which are listed later on this page). If you are not familiar with the nerdish jargon, preALPHA roughly means that the software is in its embryonic state. Is it already a life? We don't know; better consult your local Church. What we do know is that, although it is a problematic embryo, we decided to keep it and try to give it a decent upbringing. We know it will be a problematic child, probably with several impairments, but we decided to keep it anyway.

== Current State ==

{{LangStats | lang = aze | corpus1 = azadliq2012 | corpus2 = quran | corpus3 = udhr }}

== What works? What does not? ==

{| class="wikitable" style="text-align: center; width: 800px; height: 700px;"
|+ '''Verbal moods and tenses'''
|-
! scope="col" |
! scope="col" | Works
! scope="col" | Minor Problems
! scope="col" | Absent
|-
! scope="row" | Present Progressive (alıram)
| || negative doesn't work well ||
|-
! scope="row" | Imperative
| Works! || ||
|-
! scope="row" | Future indicative (alacağım)
| || 1p and 1s devoicing ||
|-
! scope="row" | Evidential/Past perfect (almışam)
| Works! || ||
|-
! scope="row" | Indefinite future / Aorist (alaram) <nowiki><t_aor></nowiki>
| Works! || ||
|-
! scope="row" | Optative present (alam)
| Works! || ||
|-
! scope="row" | Optative past (ala idim)
| Works! || ||
|-
! scope="row" | Necessitative present (almalıyım)
| Works || ||
|-
! scope="row" | Necessitative past (almalı idim)
| Works || ||
|-
! scope="row" | Abilitative (bil-)
| || has to be split ||
|-
! scope="row" | i- copula (idim)
| Works || ||
|}

{| class="wikitable" style="text-align: center; width: 800px; height: 700px;"
|+ '''Noun Inflection'''
|-
! scope="col" |
! scope="col" | Works
! scope="col" | Minor Problems
! scope="col" | Absent
|-
! scope="row" | Cases (n, g, d, acc, abl, loc)
| Works! || ||
|-
! scope="row" | Number (-lAr)
| Works! || ||
|-
! scope="row" | <nowiki>-l<I><Q></nowiki>
| || Devoicing ||
|-
! scope="row" | <nowiki>-L<A></nowiki>
| Works! || ||
|-
! scope="row" | <nowiki>-L<I></nowiki>
| Works! || ||
|-
! scope="row" | <nowiki>-C<A></nowiki> (makes things like italyanca, inglizce)
| Works! || ||
|-
! scope="row" | Possessives
| Works! || ||
|-
! scope="row" | -ki realized as <nowiki>k<I></nowiki>
| Works! || ||
|}

== Known problems ==

=== Phonology ===

# <nowiki><Q></nowiki> should be replaced by <nowiki><q></nowiki> and <nowiki><k></nowiki>, and not simply by "q" and "k"
# Devoicing should be expanded, adding <nowiki><q></nowiki>
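
One possible reading of the two phonology items above is a two-stage treatment of <nowiki><Q></nowiki>: vowel harmony first turns it into an intermediate symbol (<nowiki><q></nowiki> or <nowiki><k></nowiki>), and a later rule decides how that symbol surfaces once the following suffix is known. The Python sketch below only illustrates that idea and is not Azmorph's actual rule set. The function names, the token regex and the rule "ğ/y before a vowel, q/k elsewhere" are assumptions made for the example.

```python
import re

BACK = "aıou"
FRONT = "eəiöü"
VOWELS = set(BACK + FRONT)

# Four-way harmony for <I>: the high vowel that follows each stem vowel.
HIGH = {"a": "ı", "ı": "ı", "o": "u", "u": "u",
        "e": "i", "ə": "i", "i": "i", "ö": "ü", "ü": "ü"}

# Hypothetical intermediate symbols -> (form before a vowel, form elsewhere).
SOFT = {"<q>": ("ğ", "q"), "<k>": ("y", "k")}

# A token is either a bracketed symbol such as <Q> or a single character.
TOKEN = re.compile(r"<[^>]+>|.")


def last_vowel(text):
    """Return the last vowel letter in text (bracket characters are skipped)."""
    for ch in reversed(text):
        if ch in VOWELS:
            return ch
    raise ValueError("no vowel to harmonise with: %r" % text)


def harmonise(form, suffix):
    """Stage 1: resolve <A>, <I> and <Q> by vowel harmony.

    <Q> is NOT turned into a plain letter here; it becomes <q> or <k>
    so that a later rule can still recognise it.
    """
    out = form
    for tok in TOKEN.findall(suffix):
        if tok == "<A>":
            out += "a" if last_vowel(out) in BACK else "ə"
        elif tok == "<I>":
            out += HIGH[last_vowel(out)]
        elif tok == "<Q>":
            out += "<q>" if last_vowel(out) in BACK else "<k>"
        else:
            out += tok
    return out


def soften(form):
    """Stage 2: realise <q>/<k> depending on whether a vowel follows."""
    toks = TOKEN.findall(form)
    out = []
    for i, tok in enumerate(toks):
        if tok in SOFT:
            nxt = toks[i + 1] if i + 1 < len(toks) else ""
            out.append(SOFT[tok][0] if nxt in VOWELS else SOFT[tok][1])
        else:
            out.append(tok)
    return "".join(out)


def generate(stem, *suffixes):
    """Attach suffixes one by one, then apply the softening rule once."""
    form = stem
    for suffix in suffixes:
        form = harmonise(form, suffix)
    return soften(form)


print(generate("dost", "l<I><Q>"))          # bare derived noun
print(generate("dost", "l<I><Q>", "<I>"))   # same noun plus a vowel-initial suffix
```

If stage 1 wrote a literal "q" or "k" straight away, stage 2 could no longer tell a suffix-final consonant that alternates from an ordinary "q" or "k" inside a stem, which seems to be the point of the first item.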
Latest revision as of 06:09, 6 December 2013
Statistics rendered by the LangStats template:

* Number of stems: 11,120

{| class="wikitable"
! corpus !! words !! coverage
|-
| azadliq2012 (RFERL corpora) || 2.2M || n/a
|-
| quran || 153K || n/a
|-
| udhr || 1.5K || n/a
|}