Apertium Indic


Apertium Indic is a subfamily of Apertium-based systems for Indic (Indo-Aryan) languages, and could be extended to include Dravidian languages in the future. This page defines a set of standards and definitions that ought to generalise fairly well across the entire family, and that should be followed as far as possible. A standard reference is The Indo-Aryan Languages [1].

Currently, Marathi has the most comprehensive morphological analyser in Apertium and should be used as a reference for building other analysers.

Morphology

Nominal

  • Follow Masica's three-level morphology wherever possible: bare nouns are nominative (<nom>); altered roots are oblique (<obl>); altered roots that carry a case ending are marked directly with that case; postpositions attach to the oblique.
  • The main difference between a case and a postposition is whether an intervening element can appear between the oblique and the marker: if it can, the marker is a postposition. Other differences are subjective, but a language ought not to have more than 10 cases.
  • Genitives are analysed as separate from the root, even though the genitive is traditionally treated as a case.
  • Pronouns should not be treated as fusional and should analyse the same way a noun would, whether more complex forms have been lexicalised or not.
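As a rough illustration of the nominal guidelines above, the sketch below builds analyses in the Apertium stream format (^surface/lemma<tags>$, with + joining lexical units). The romanised forms, the helper names and the exact tag choices are assumptions made for the example and are not taken from any Apertium dictionary. In particular, this page does not say whether <obl> is kept alongside a case tag, so the sketch simply drops it.

```python
def lexical_unit(lemma, tags):
    """Render one lexical unit as lemma<tag1><tag2>..."""
    return lemma + "".join(f"<{tag}>" for tag in tags)


def analysis(surface, *units):
    """Wrap units in stream format: ^surface/unit1+unit2$."""
    joined = "+".join(lexical_unit(lemma, tags) for lemma, tags in units)
    return "^" + surface + "/" + joined + "$"


def marker_kind(allows_intervening_element):
    """Apply the test above: postpositions allow intervening material, cases do not."""
    return "postposition" if allows_intervening_element else "case"


# Bare noun: nominative.
nominative = analysis("ghar", ("ghar", ["n", "nom"]))

# Altered root with a case ending: the case is marked directly (assumed tag: dat).
dative = analysis("gharala", ("ghar", ["n", "dat"]))

# Postposition: a separate lexical unit attached to the oblique (assumed tag: post).
with_postposition = analysis(
    "gharamadhye", ("ghar", ["n", "obl"]), ("madhye", ["post"])
)

print(nominative)
print(dative)
print(with_postposition)
```

A real analyser would also carry gender and number tags; they are left out here to keep the three-way contrast visible.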

Verbal

  • All verb forms must be covered; this goes without saying.
  • Verbs typically inflect for perfective/imperfective aspect, not past/present tense. Tense is imparted by copulas.
  • Gerunds exist and typically overlap with infinitives, but can take cases/postpositions.
  • Light verbs (N + V) should be treated as separate elements. If N cannot exist as an independent word, gloss it as <adv>.
  • Compound verbs (V + V) receive no special treatment.
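The division of labour between verb and copula described above can be sketched as follows. This is only a toy model: vblex and vbser are the usual Apertium tags for lexical verbs and the copula, but the aspect and tense tag names (pfv, ipfv, pres, past) and the function names are placeholders chosen for the example, not a proposed standard.

```python
ASPECT_TAGS = {"pfv", "ipfv"}  # placeholder aspect tags carried by the main verb
TENSE_TAGS = {"pres", "past"}  # placeholder tense tags carried by the copula


def first_match(tags, allowed):
    """Return the first tag that belongs to `allowed`, or None if there is none."""
    return next((tag for tag in tags if tag in allowed), None)


def clause_tense_aspect(verb_tags, copula_tags=()):
    """Read aspect off the verb and tense off the copula, if a copula is present."""
    return {
        "aspect": first_match(verb_tags, ASPECT_TAGS),
        "tense": first_match(copula_tags, TENSE_TAGS),
    }


# Imperfective verb form followed by a past copula.
with_copula = clause_tense_aspect(["vblex", "ipfv"], ["vbser", "past"])

# Perfective verb form on its own: no tense is supplied.
without_copula = clause_tense_aspect(["vblex", "pfv"])

print(with_copula)
print(without_copula)
```

The point of the sketch is that the verb form never receives a tense tag of its own; tense only appears when a copula contributes it.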

Verb form table

Massive TODO

Active contributors

If you're contributing to any Indic language, add yourself here.

Name | IRC nick | Indic projects involved in
Vinit Ravishankar | vin-ivar | Marathi

Bibliography

[1] Masica, C.P. (1993) The Indo-Aryan Languages. Cambridge Language Surveys. Cambridge University Press. https://books.google.com.mt/books?id=Itp2twGR6tsC