Difference between revisions of "Bosnian-Croatian-Montenegrin-Serbian and Macedonian"

Revision as of 12:44, 23 May 2011

Progress of the work in the bonding period

So far, a new dictionary has been started from scratch, and some paradigms from the grammar of Croatian have been added, along with some closed word categories. The most extensive work has been done on masculine noun paradigms, and it seems that nouns will take up most of the work. Adjectives are inherently more work per paradigm, but they show less variation. Verb paradigms have been added to the dictionary for the present, aorist, imperfect, and future I tenses (future I being the combination of the clitic 'ću' with the infinitive).

Todo

Testing framework
  • Set up pending/regression tests framework
  • Set up testvoc
  • Set up corpus/generation-test
Serbo-Croatian dictionary

The reflex of yat:

  • Adding two additional modes to the monodix (ek/ijek) so that lemmas containing yat can be analysed both as ekavian and ijekavian (done)
  • Update the makefile and the xslt machinery so that this works
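
As a rough illustration of the ek/ijek idea above, the following Python sketch expands a lemma containing a yat slot into its ekavian and ijekavian forms. The `{e|ije}` slot notation and the function name `expand_yat` are hypothetical and exist only for this example; they are not the attribute syntax used in the monodix or by the xslt machinery.

```python
import re

# Hypothetical notation (an assumption for this sketch, not Apertium syntax):
# "{EK|IJEK}" marks the yat reflex inside a lemma, e.g. "ml{e|ije}ko".
YAT_SLOT = re.compile(r"\{([^|{}]*)\|([^|{}]*)\}")


def expand_yat(template):
    """Return the ekavian and ijekavian spellings of a lemma template."""
    return {
        # Keep the first alternative of every slot for the ekavian mode.
        "ek": YAT_SLOT.sub(lambda m: m.group(1), template),
        # Keep the second alternative of every slot for the ijekavian mode.
        "ijek": YAT_SLOT.sub(lambda m: m.group(2), template),
    }


if __name__ == "__main__":
    for lemma in ("ml{e|ije}ko", "m{e|je}sto", "kuća"):
        print(lemma, expand_yat(lemma))
```

A lemma without a slot, such as "kuća", comes out identical in both modes, which is the behaviour one would want for entries that do not contain yat.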

Verbs (marked for aspect and transitivity)

Modal verbs:

  • Verb to be (biti) (done)
  • Clitic verb htjeti, to mark the future (done)
  • suffixes for present, imperfect and aorist (done)
  • the future II auxiliary verb (the perfective present of to be) is already in the dictionary, but needs to be marked accordingly
  • the same goes for the aorist of to be, which marks the conditional
  • the l-participle needs more detailed marking, since it behaves differently with respect to number

Adjectives:

  • One paradigm added, with quite extensive marking
  • marked for definiteness

Nouns:

  • masculine: a large number of paradigms covered
  • feminine: some general cases
  • neuter: some general cases

Closed word categories:

    • prepositions (including the ones of type s/sa and k/ka, which need to be postprocessed in generation) (done)
    • conjunctions (done)
    • interjections (done)
    • particles
    • pronouns: personal (done), reflexive (done), possessive, interrogative, relative, demonstrative (pronoun and adjective), indefinite, negative, ...
  • Add the paradigms from the grammar of Croatian (the one by Barić, Lončarić, Malić, Pavešić, Peti, Zečević, Znika) to the sh monodix [in progress]
  • Obtain a grammar of Serbian, for reference on differences
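
The s/sa and k/ka alternation mentioned in the list above could be handled in generation roughly as in the following Python sketch. It is a simplified illustration under stated assumptions: the rule used here (sa before s, z, š, ž; ka before k, g) ignores further cases such as "sa mnom" and certain consonant clusters, and the function names are hypothetical rather than part of any Apertium tool.

```python
# Simplified trigger sets (an assumption; real usage has more cases).
SA_TRIGGERS = ("s", "z", "š", "ž")
KA_TRIGGERS = ("k", "g")


def choose_preposition(base, next_word):
    """Pick the short or long preposition form from the following word."""
    first = next_word[:1].lower()
    if base == "s" and first in SA_TRIGGERS:
        return "sa"
    if base == "k" and first in KA_TRIGGERS:
        return "ka"
    return base


def postprocess(tokens):
    """Rewrite s/k to sa/ka in a token list where the next word requires it."""
    result = []
    for i, token in enumerate(tokens):
        if token in ("s", "k") and i + 1 < len(tokens):
            result.append(choose_preposition(token, tokens[i + 1]))
        else:
            result.append(token)
    return result


if __name__ == "__main__":
    print(" ".join(postprocess(["s", "sestrom"])))
    print(" ".join(postprocess(["s", "bratom"])))
    print(" ".join(postprocess(["k", "kući"])))
```

Keeping a single base form (s, k) in the dictionary and choosing the surface form only at the end keeps the analysis side simple, which is presumably why the list treats this as a generation-time postprocessing step.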
Macedonian dictionary
  • Add determiner forms for some pronouns (e.g. demonstratives, possessives, etc.), i.e. things that can modify nouns
Bilingual dictionary
  • Update the pronoun entries, since the symbols in the monodix have been adjusted to correspond more closely to the analysis in the Macedonian monodix
Transfer rules
