Difference between revisions of "Bosnian-Croatian-Montenegrin-Serbian and Macedonian"

Adjectives:
Crossing animacy with definiteness produces many duplicate entries. Since some of the cases (G, D/L, I singular, and D/L/I plural, for instance) do not mark gender specifically, I have removed animacy in those cases and marked them "mn" or "mfn" accordingly.
<s>*Idea: unify the D/L/I plural into one case, and D/L singular into one case, since they are always morphologically identical.
** The paradigms entered would be more concise
** Would complicate matters with future translation pairs with other Slavic languages, e.g. Slovene
** If incorporating dialect words into the dictionary (e.g. Kajkavian or Čakavian), the separate markers for cases would have to be used</s>
*In the Macedonian monodix no adjective is marked positive, only comparative and superlative, therefore I'm taking the same approach.

* Add the paradigms from the grammar of Croatian (the one by Barić, Lončarić, Malić, Pavešić, Peti, Zečević, Znika) to the sh monodix [in progress]
** <s>prepositions (including the ones of type s/sa and k/ka, which need to be postprocessed in generation)</s>
Revision as of 13:26, 2 May 2011

Progress of the work in the bonding period

So far, a new dictionary has been started from scratch, and some paradigms from the grammar of Croatian have been added, along with some closed word categories. For details, see the Todo list.

Todo

Testing framework
  • Set up pending/regression tests framework
  • Set up testvoc
  • Set up corpus/generation-test
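
As a rough illustration of what a pending/regression split could look like, here is a minimal Python sketch. It is not the Apertium test tooling: `run_suite`, the `translate` callable and the placeholder test strings are all hypothetical, and the sketch only assumes that regression tests are expected to pass while pending tests are expected to fail for now.

```python
from typing import Callable, Dict, List, Tuple

# A test case is a (source sentence, expected translation) pair.
TestCase = Tuple[str, str]


def run_suite(
    translate: Callable[[str], str],
    regression: List[TestCase],
    pending: List[TestCase],
) -> Dict[str, List[str]]:
    """Report regression tests that now fail and pending tests that now pass.

    `translate` is a hypothetical stand-in for whatever runs the pair.
    """
    report: Dict[str, List[str]] = {"regressions": [], "promotable": []}
    for source, expected in regression:
        if translate(source) != expected:
            report["regressions"].append(source)
    for source, expected in pending:
        if translate(source) == expected:
            report["promotable"].append(source)
    return report


# Toy stand-in translator over placeholder strings (not real language data).
toy_table = {"a": "A", "b": "B"}
toy_translate = lambda s: toy_table.get(s, s)

report = run_suite(
    toy_translate,
    regression=[("a", "A"), ("c", "C")],
    pending=[("b", "B"), ("d", "D")],
)
```

Keeping the two lists separate means a newly passing pending test can be moved to the regression list by hand, while a failing regression test signals that something broke.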
Serbo-Croatian dictionary

Adjectives: Crossing animacy with definiteness produces many duplicate entries. Since some of the cases (G, D/L, I singular, and D/L/I plural, for instance) do not mark gender specifically, I have removed animacy in those cases and marked them "mn" or "mfn" accordingly.
  • Idea (struck out in this revision): unify the D/L/I plural into one case, and D/L singular into one case, since they are always morphologically identical.
    • The paradigms entered would be more concise
    • Would complicate matters with future translation pairs with other Slavic languages, e.g. Slovene
    • If incorporating dialect words into the dictionary (e.g. Kajkavian or Čakavian), the separate markers for cases would have to be used
  • In the Macedonian monodix no adjective is marked positive, only comparative and superlative, therefore I'm taking the same approach.
  • Add the paradigms from the grammar of Croatian (the one by Barić, Lončarić, Malić, Pavešić, Peti, Zečević, Znika) to the sh monodix [in progress]
    • prepositions (including the ones of type s/sa and k/ka, which need to be postprocessed in generation)
    • conjunctions
    • interjections
    • particles
    • nouns (masculine, feminine, neuter)
    • adjectives (the definite and indefinite form paradigms)
    • verbs
  • Add the personal clitic and non-clitic pronouns, the reflexive clitic and non-clitic pronouns, and the possessive, interrogative, relative, demonstrative (pronoun and demonstrative adjective), indefinite, negative ... pronouns
  • Add the clitic forms of the verb "to be", its long present forms, and the auxiliary verbs for other tenses
  • Obtain a grammar of Serbian, for reference on differences
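
The s/sa and k/ka note in the list above says these prepositions need postprocessing in generation. The following Python sketch shows the general idea only; it is not the Apertium post-generator or its rule format. The function names are hypothetical, and the trigger sets are a deliberately simplified assumption (sa before words starting with s, z, š, ž and before "mnom"; ka before words starting with k or g). The real conditions are richer and should be taken from the grammar.

```python
from typing import List

# Assumed, simplified triggers; the full conditions are richer than this.
SA_BEFORE = ("s", "z", "š", "ž")
KA_BEFORE = ("k", "g")


def adjust_preposition(prep: str, next_word: str) -> str:
    """Return the s/sa or k/ka variant suited to the following word."""
    base = prep.lower()
    nxt = next_word.lower()
    if base in ("s", "sa"):
        chosen = "sa" if nxt.startswith(SA_BEFORE) or nxt == "mnom" else "s"
    elif base in ("k", "ka"):
        chosen = "ka" if nxt.startswith(KA_BEFORE) else "k"
    else:
        return prep  # not one of the alternating prepositions
    # Preserve sentence-initial capitalisation.
    return chosen.capitalize() if prep[:1].isupper() else chosen


def postprocess(tokens: List[str]) -> List[str]:
    """Fix preposition variants by looking one token ahead."""
    out = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        out.append(adjust_preposition(tok, nxt))
    return out
```

For instance, `postprocess(["idem", "s", "sestrom"])` would switch the preposition to "sa", while `postprocess(["idem", "s", "prijateljem"])` leaves it unchanged. Working on the generated surface tokens keeps the dictionary entries themselves free of the alternation.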
Macedonian dictionary
  • Add determiner forms for some pronouns (e.g. demonstratives, possessives, etc.) -- things that can modify nouns
Bilingual dictionary
  • Update the pronoun entries; the symbols in the monodix have been adjusted to correspond more closely to the analysis in the Macedonian monodix
Transfer rules
