Difference between revisions of "User:Firespeaker/TODO"
Hectoralos
Firespeaker (→Big)
(52 intermediate revisions by 3 users not shown)
See [[User:Firespeaker/TODO/done|TODO/done]]

== General ==
* [[Apertium-kaz-kir/TODO]]
* [[Apertium-tat/TODO]]
* [[Apertium-kaz/todo]]
* [[Apertium Turkic/TODO]]

== Big ==
* Implement productive causative in apertium-kaz
* Implement productive causative in apertium-tat
* Implement ifi.evid correctly in <s>apertium-kaz</s>, <s>apertium-kir</s>, apertium-kaa
* Change {{tag|px}} to {{tag|gen}}{{tag|attr}} and {{tag|gen}}{{tag|subst}} in <s>Kazakh, Kyrgyz,</s> all Turkic
* Change reflexive pronoun endings to px* forms in all Turkic (uncomment in Kyrgyz, etc.)
* Figure out [[User:Firespeaker/Kazakh_negatives|Kazakh and Kyrgyz negatives]]

== To think about ==
* Problems with new build process
** How can we do single-category testvoc now?
** How can we make vanilla transducers (without MT-specific "wrong" POSes)
** How can we count trimmed stems?

== Things for selimcan ==
* [[Kazakh and Tatar#Twol related stuff]]
* [[Apertium-tat/TODO]]

== Things for spectie ==
* Implement new case/postposition system at [[Morphology_of_Kyrgyz_language#All_cases_table]]
** From conversation in logs as Thu 12 Jul 2012 01:35:14 AM EDT
* <s>Document non-finite verb types at [[Turkic_lexicon#Non-finite_verbs]]</s>
** Integrate Turkish non-finite forms into [[Morphology of Turkish]] and reference it from [[Turkic_lexicon#Non-finite_verbs]]
* <s>Get numerals working in Kazakh</s>
* <s>Adjectives in [[Turkic lexicon]].</s>
:: Good enough? --[[User:Firespeaker|Firespeaker]] 15:52, 21 September 2012 (UTC)
::: Could we get examples/categories for kir/kaz? - [[User:Francis Tyers|Francis Tyers]] 17:39, 21 September 2012 (UTC)

== Things for hector2 ==
* Vowel harmony: tests/автан.yaml. There is a strange change of %{Ӑ%} to и. --[[User:Hectoralos|Hèctor Alòs i Font]] 19:05, 21 September 2012 (UTC)
:: Where is this strange change? The only thing not working for the автан.yaml file right now is a couple of dative forms, but this is a widespread problem (see below). --[[User:Firespeaker|Firespeaker]] 03:29, 28 September 2012 (UTC)
* <s>tests/a_0.yaml at apertium-cv-tr : For some reason {а} in the present tense suffix doesn't fall in verbs like вула (ending in а)</s>
* <s>tests/ger1.yaml at apertium-cv-tr : For some reason м falls in ger1 (<tt>%>м%{А%}</tt>) (and it shouldn't), but м does not fall e.g. in {{tag|neg}}{{tag|pres}} (<tt>%>м%{А%}с%>т</tt>)</s>
:: I see nothing wrong in ger1.yaml. If there are forms that aren't working right, could you add them to the yaml file? --[[User:Firespeaker|Firespeaker]] 01:00, 1 August 2012 (UTC)
::: Not really. Two forms are generated, I don't know why. One is the good one, but the other (without м) is odd:
:::[PASS] вула<v><tv><ger1> => вулама
:::[FAIL] вула<v><tv><ger1> => unexpected results: вулаа
:::--[[User:Hectoralos|Hèctor Alòs i Font]] 05:29, 1 August 2012 (UTC)
::::The problem is in lexc. I don't know yet why there's a mix between ger1 and ger10, but it has nothing to do with twol.--[[User:Hectoralos|Hèctor Alòs i Font]] 10:15, 7 August 2012 (UTC)
* <s>tests/пенсионер.yaml : After the 3rd person affix (ӗ) front vowels should be used, but the {RUS} tag blocks the vowel harmony for the whole word. It should block it only until {ӗ} is found</s>
:: Genitive fixed in <tt>r40191</tt>, still working on dative --[[User:Firespeaker|Firespeaker]] 06:19, 9 August 2012 (UTC)
::: Great! --[[User:Hectoralos|Hèctor Alòs i Font]] 10:04, 15 August 2012 (UTC)
* <s>tests/кала.yaml : In some tenses the last vowel of the root falls, even if it's а or e and the tense (or person) affix begins with {Ӑ}. In order to avoid new archiphonemes that would have to be added in zillions of rules, I've created a new pseudoarchiphoneme {del2}, similar to {del}. A couple of rules in twol should do all the work (search del2). The problem is that when these two rules are uncommented, no px3sp words are recognized, nor are the verbal forms with del2 generated. The compiler doesn't show any rule conflict and I can't see one either. In кала.yaml a few px3sp forms are added to make the problem easier to follow. There's also a file tests/кил.yaml which works well, as the verb doesn't end in a vowel.<br>I put the problem first in the pending list, as it affects lots of words.</s>
** I finally decided it's cleaner to create an archiphoneme {Ӑ2} (which is also needed for px2pl) and I'm working on it.--[[User:Hectoralos|Hèctor Alòs i Font]] 07:27, 23 August 2012 (UTC)
* tests/чикӗ.yaml, tests/училище.yaml : There are strange errors in the vowel harmony for some cases in the px2sg. In some cases the rule "Vowel harmony for archiphoneme {У}" is correctly applied, but in others it is not. I've tried to add a couple of lines (search "училище" in the twol file), but they didn't solve the problem.
* tests/шӑши.yaml: An epenthetic й must appear in most of the person forms of words ending in и. The problem is that px3sg forms of words ending in a short vowel (tests/чикӗ.yaml) and Russian words ending in и or ий (tests/информаци.yaml, tests/Золотницкий.yaml) must also be considered (информаци and шӑши are different).
* tests/Золотницкий.yaml: An epenthetic й must appear in px2pl (probably, in the rules relating to %{Ӑdel%}, %{й%} must also be considered).
* tests/хӑю.yaml : As in tests/ту.yaml (which works perfectly), there is a у/ӑв variation in this word. The problem is that (because of the Russian orthography) ю has to be split and a nonexistent position for й has to be found. That's why, for instance, adding ю in the rule "в surfaces in у/ӳ > ӑв/ӗв before vowel (2)" may not solve part of the problem. A solution could be adding something at the end of this kind of word in lexc, but that may cause problems in twol (fortunately there is no vowel harmony in this case). A very dirty trick could be to use the morpheme boundary symbol for that.

=== questions/requests for hector2 ===
* Could we have pages for the following irregular nouns, especially the first two? --[[User:Firespeaker|Firespeaker]] 08:30, 18 September 2012 (UTC)
** [[пичче]]
** [[аппа]]
** [[кӗрӳ]]
** [[мучи]]
** [[кинемей]]
** [[кукаҫи]]
** [[ен]]
** [[хӗрри]]
* Could I have yaml files for the following words? --[[User:Firespeaker|Firespeaker]] 08:46, 18 September 2012 (UTC)
** <s>[[йӗп]]</s>
** <s>[[пуртӑ]]</s>
** <s>[[атте]]</s>
** [[пӳ]]
** [[лаша]]
** <s>[[утӑ]]</s>
*** You can use the script tests/gen_yaml_mot.sh, which creates a yaml for a given noun from the wiki.yaml file (which I regularly read from the wiki, when I change something in it). As above, I still have some doubts about px2sg.loc and px2sg.abl, but all other forms should be correct. Атте should be correct in all forms.--[[User:Hectoralos|Hèctor Alòs i Font]] 19:07, 18 September 2012 (UTC)
**** See above about px2sg (by the way, I see that both атте and анне have the forms px2sg.loc and px2sg.abl which I corrected everywhere. As they are irregular words I don't dare correct them... It may be said that really in these words personal suffixes are used, so this gives some more confidence).--[[User:Hectoralos|Hèctor Alòs i Font]] 19:03, 20 September 2012 (UTC)
* I don't understand what the rule is for when {У} deletes in {{tag|px2sg}}{{tag|dat}}. Do you know what the rule is? --[[User:Firespeaker|Firespeaker]] 03:29, 28 September 2012 (UTC)

== General cv.twol TODO list ==
* <s>gemination</s>
* <s>ӳ:ӗв, у:ӑв</s>
* [[Morphology_of_Chuvash#Nouns_ending_in_.E2.80.B9.D0.BE.E2.80.BA|Nouns ending in о]]
* {{tag|px2sg}}{{tag|dat}} of nouns
* "Irregular" nouns (family relations that have different stems)
* clean up twol conflicts

[[Category:TODO lists]]
Latest revision as of 21:32, 19 August 2015
See TODO/done
General
- Apertium-kaz-kir/TODO
- Apertium-tat/TODO
- Apertium-kaz/todo
- Apertium Turkic/TODO
Big
- Implement productive causative in apertium-kaz
- Implement productive causative in apertium-tat
- Implement ifi.evid correctly in <s>apertium-kaz</s>, <s>apertium-kir</s>, apertium-kaa
- Change <px> to <gen><attr> and <gen><subst> in <s>Kazakh, Kyrgyz,</s> all Turkic
- Change reflexive pronoun endings to px* forms in all Turkic (uncomment in Kyrgyz, etc.)
- Figure out Kazakh and Kyrgyz negatives
To think about
- Problems with new build process
  - How can we do single-category testvoc now?
  - How can we make vanilla transducers (without MT-specific "wrong" POSes)
  - How can we count trimmed stems?
Things for selimcan
- Kazakh and Tatar#Twol related stuff
- Apertium-tat/TODO
Things for spectie
- Implement new case/postposition system at Morphology_of_Kyrgyz_language#All_cases_table
  - From conversation in logs as Thu 12 Jul 2012 01:35:14 AM EDT
- <s>Document non-finite verb types at Turkic_lexicon#Non-finite_verbs</s>
  - Integrate Turkish non-finite forms into Morphology of Turkish and reference it from Turkic_lexicon#Non-finite_verbs
- <s>Get numerals working in Kazakh</s>
- <s>Adjectives in Turkic lexicon.</s>
  - Good enough? --Firespeaker 15:52, 21 September 2012 (UTC)
    - Could we get examples/categories for kir/kaz? - Francis Tyers 17:39, 21 September 2012 (UTC)
Things for hector2
- Vowel harmony: tests/автан.yaml. There is a strange change of %{Ӑ%} to и. --Hèctor Alòs i Font 19:05, 21 September 2012 (UTC)
  - Where is this strange change? The only thing not working for the автан.yaml file right now is a couple dative forms, but this is a widespread problem (see below). --Firespeaker 03:29, 28 September 2012 (UTC)
questions/requests for hector2
- Could we have pages for the following irregular nouns, especially the first two? --Firespeaker 08:30, 18 September 2012 (UTC)
  - пичче
  - аппа
  - кӗрӳ
  - мучи
  - кинемей
  - кукаҫи
  - ен
  - хӗрри
- Could I have yaml files for the following words? --Firespeaker 08:46, 18 September 2012 (UTC)
  - <s>йӗп</s>
  - <s>пуртӑ</s>
  - <s>атте</s>
  - пӳ
  - лаша
  - <s>утӑ</s>
    - You can use the script tests/gen_yaml_mot.sh, which creates a yaml for a given noun from the wiki.yaml file (which I regularly read from the wiki, when I change something in it). As above, I still have some doubts about px2sg.loc and px2sg.abl, but all other forms should be correct. Атте should be correct in all forms.--Hèctor Alòs i Font 19:07, 18 September 2012 (UTC)
      - See above about px2sg (by the way, I see that both атте and анне have the forms px2sg.loc and px2sg.abl which I corrected everywhere. As they are irregular words I don't dare correct them... It may be said that really in these words personal suffixes are used, so this gives some more confidence).--Hèctor Alòs i Font 19:03, 20 September 2012 (UTC)
- I don't understand what the rule is for when {У} deletes in <px2sg><dat>. Do you know what the rule is? --Firespeaker 03:29, 28 September 2012 (UTC)
General cv.twol TODO list
- <s>gemination</s>
- <s>ӳ:ӗв, у:ӑв</s>
- Nouns ending in о
- <px2sg><dat> of nouns
- "Irregular" nouns (family relations that have different stems)
- clean up twol conflicts