Difference between revisions of "Kazakh and Tatar/Diary"
Revision as of 23:52, 30 May 2012
Monday, 28th May 2012
Checking & refactoring clitics
Some clitics appear only after certain forms (e.g. "шы<mod_foc>" in Kazakh, which expresses politeness, attaches only to 2nd person singular forms). And vice versa: some forms can take only certain clitics (imperative forms take only "чы" and "сана" in Tatar).
I moved the above clitics into a separate lexicon and linked the imperative forms to it, so that there is no overgeneration now (and life is a bit easier for spectie's "testvocing" PCs).
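The idea of linking a form class to its own clitic lexicon can be illustrated with a toy model. This is only a sketch in Python, not the actual lexc source: the lexicon name CLIT-IMP, the form-class labels and the helper function are hypothetical, and the general CLIT lexicon is reduced to a single particle for brevity.

```python
# Toy model of continuation classes: each form class continues to exactly
# one clitic lexicon, so a form can only receive the clitics listed there.
# "CLIT-IMP" and the form-class labels are hypothetical names for illustration.
CLITIC_LEXICONS = {
    "CLIT": ["бит"],              # general clitics (heavily abridged)
    "CLIT-IMP": ["чы", "сана"],   # the only clitics Tatar imperatives take
}

CONTINUATIONS = {
    "imperative": "CLIT-IMP",
    "other": "CLIT",
}


def allowed_clitics(form_class: str) -> list[str]:
    """Return the clitics reachable from the given form class."""
    return CLITIC_LEXICONS[CONTINUATIONS[form_class]]


if __name__ == "__main__":
    for form_class in CONTINUATIONS:
        print(form_class, "->", allowed_clitics(form_class))
```

Because imperative forms no longer continue to the general lexicon, clitics outside CLIT-IMP are never generated after them, which is what removes the overgeneration.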
In Tatar some new clitics were added as well.
Tuesday, 29th May 2012
Checking & refactoring clitics (cont.)
A question about whether %+ғана%<postadv%>:% %{G%}ана # ; ! "only" in the CLIT continuation class was correct led to a discussion about whether we should handle the harmonization of such words in the transducer (that is, matching them to the previous word) or whether the post-generator can take care of that.
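To make "matching them to the previous word" concrete, here is a minimal Python sketch of resolving the {G} archiphoneme after generation. It is an illustration under simplifying assumptions, not the project's transducer or post-generator: the function names are hypothetical, and the rule is reduced to "қ after a voiceless consonant, ғ otherwise", with an assumed set of voiceless letters.

```python
# Assumed set of letters treated as voiceless consonants (a simplification).
VOICELESS = set("пфкқтсшщхцчһ")


def resolve_g(prev_word: str, clitic: str) -> str:
    """Replace the {G} placeholder based on the last letter of prev_word."""
    if not prev_word:
        raise ValueError("prev_word must not be empty")
    initial = "қ" if prev_word[-1].lower() in VOICELESS else "ғ"
    return clitic.replace("{G}", initial, 1)


def attach(prev_word: str, clitic: str) -> str:
    """Join the previous word and the harmonized clitic with a space."""
    return f"{prev_word} {resolve_g(prev_word, clitic)}"


if __name__ == "__main__":
    print(attach("кітап", "{G}ана"))
    print(attach("бала", "{G}ана"))
```

Whether such a rule belongs inside the transducer or in a separate post-generation step is exactly the open question from the discussion; the sketch only shows what information (the end of the previous word) the rule needs.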
Another thing is that some Tatar modal particles do not vary depending on the previous word (e.g. "бит"), but I have put them into the CLIT continuation class (as all the other modal particles were there). This might be wrong.
I learned a lot of new stuff :), but the possible changes to the CLIT lexicon were left for later.
Some work on postadverbs
See Postadverbs
Wednesday, 30th May 2012
Had to study for a "zachet" (a pass/fail exam), not much done, but:
Went over numerals again, some additions
Started categorizing postpositions depending on what case they govern
Their "case-governance" often differs between the two languages, so some transfer rules will be required.
I'll need help setting up coverage-measuring scripts and learning how to testvoc only certain POS's.
Also I think that I need another story :) Testing things on a parallel text well before the midterm is a good idea anyway.