Difference between revisions of "Marathi-Hindi Developer Documentation"

From Apertium
(Documentation of mar-hin development)
 
(Add genitive explanation)
Line 1: Line 1:
This is documentation of significant changes made to apertium-mar-hin since 20171128.
+
This is documentation of significant changes made to [https://sourceforge.net/p/apertium/svn/HEAD/tree/incubator/apertium-mar-hin/ apertium-mar-hin] (and the individual language modules [https://sourceforge.net/p/apertium/svn/HEAD/tree/languages/apertium-mar/ apertium-mar] and [https://sourceforge.net/p/apertium/svn/HEAD/tree/languages/apertium-hin/ apertium-hin]) since 20171128.
   
  +
= apertium-mar =
* prn/det
 
   
  +
== Genitives ==
* genitives being classified as dets
 
   
  +
Consider the Marathi phrases:
* clit 'chya' getting a separate lemma
 
  +
* त्याचा घोडा (''tyacha ghoda'') = his horse
  +
* त्याची गाय (''tyachi gaay'') = his cow
  +
* तिचा घोडा (''ticha ghoda'') = her horse
  +
* तिची गाय (''tichi gaay'') = her cow
  +
  +
The possessive determiner agrees with two things at once: the gender of the possessor ('his' versus 'her') and the gender of the possessed noun (घोडा (''ghoda'') is grammatically masculine and गाय (''gaay'') is feminine). The analysis therefore uses two lemmas, one for the possessor and one for the genitive marker that agrees with the possessed noun, so that both genders can be specified. These phrases are analyzed as:
  +
* <code>^त्याचा/तो<det><dist><m><sg><obl>+च<gen><m><sg><nom>$ ^घोडा/घोडा<n><m><sg><nom>$</code>
  +
* <code>^त्याची/तो<det><dist><m><sg><obl>+च<gen><f><sg><nom>$ ^गाय/गाय<n><f><sg><nom>$</code>
  +
* <code>^तिचा/तो<det><dist><f><sg><obl>+च<gen><m><sg><nom>$ ^घोडा/घोडा<n><m><sg><nom>$</code>
  +
* <code>^तिची/तो<det><dist><f><sg><obl>+च<gen><f><sg><nom>$ ^गाय/गाय<n><f><sg><nom>$</code>
  +
  +
The reason for the <code><obl></code> is explained in a section below.
  +
  +
Note: The <code>च<gen></code> lemma is the genitive marker for almost all words that can take a genitive: determiners like those above, nouns, postpositions, etc. The only exceptions are some determiners, such as माझा (''my'' (masc. sg.)) and आपला (''our'' (incl., masc. sg.)), whose surface forms do not contain च. These are nevertheless analyzed with the same lemma, e.g. <code>^माझा/मी<det><p1><mf><sg><obl>+च<gen><m><sg><nom>$</code>.
  +
  +
Another note: Marathi has three grammatical genders and two grammatical numbers, and the same double agreement applies across all of them.
  +
  +
== Analysis of pronouns and determiners ==
  +
  +
=== Clitics ===
  +
  +
== Constraint Grammar rules ==
  +
  +
  +
= apertium-mar-hin =
  +
  +
== Gender and number agreement ==

Revision as of 04:14, 28 January 2018

This is documentation of significant changes made to apertium-mar-hin (and the individual language modules apertium-mar and apertium-hin) since 20171128.

apertium-mar

Genitives

Consider the Marathi phrases:

  • त्याचा घोडा (tyacha ghoda) = his horse
  • त्याची गाय (tyachi gaay) = his cow
  • तिचा घोडा (ticha ghoda) = her horse
  • तिची गाय (tichi gaay) = her cow

The possessive determiner agrees with two things at once: the gender of the possessor ('his' versus 'her') and the gender of the possessed noun (घोडा (ghoda) is grammatically masculine and गाय (gaay) is feminine). The analysis therefore uses two lemmas, one for the possessor and one for the genitive marker that agrees with the possessed noun, so that both genders can be specified. These phrases are analyzed as:

  • ^त्याचा/तो<det><dist><m><sg><obl>+च<gen><m><sg><nom>$ ^घोडा/घोडा<n><m><sg><nom>$
  • ^त्याची/तो<det><dist><m><sg><obl>+च<gen><f><sg><nom>$ ^गाय/गाय<n><f><sg><nom>$
  • ^तिचा/तो<det><dist><f><sg><obl>+च<gen><m><sg><nom>$ ^घोडा/घोडा<n><m><sg><nom>$
  • ^तिची/तो<det><dist><f><sg><obl>+च<gen><f><sg><nom>$ ^गाय/गाय<n><f><sg><nom>$
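
The structure of these analyses can be made explicit with a short script. The following is a minimal sketch, not part of apertium-mar: the function name parse_lexical_unit is hypothetical, and it assumes a lexical unit with exactly one analysis and no escaped characters. It splits a unit into its surface form and its sub-readings, so the possessor tags and the genitive-marker tags can be read separately.

```python
import re

# Matches one tag such as <det> or <sg> and captures the name inside.
TAG_RE = re.compile(r"<([^<>]+)>")


def parse_lexical_unit(unit):
    """Split '^surface/lemma<tags>+lemma<tags>$' into (surface, sub-readings).

    Hypothetical helper for illustration: assumes a single analysis
    (one '/') and no escaped characters in the unit.
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("expected a unit of the form ^surface/analysis$")
    surface, analysis = unit[1:-1].split("/", 1)
    sub_readings = []
    # '+' joins the sub-readings: here the possessor and the genitive marker.
    for piece in analysis.split("+"):
        lemma = piece.split("<", 1)[0]
        tags = TAG_RE.findall(piece)
        sub_readings.append((lemma, tags))
    return surface, sub_readings


surface, (possessor, genitive) = parse_lexical_unit(
    "^त्याची/तो<det><dist><m><sg><obl>+च<gen><f><sg><nom>$"
)
print(surface, possessor, genitive)
```

For त्याची, the first sub-reading carries the possessor's gender (m) and the second carries the gender of the possessed noun (f).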

The reason for the <obl> is explained in a section below.

Note: The च<gen> lemma is the genitive marker for almost all words that can take a genitive: determiners like those above, nouns, postpositions, etc. The only exceptions are some determiners, such as माझा (my (masc. sg.)) and आपला (our (incl., masc. sg.)), whose surface forms do not contain च. These are nevertheless analyzed with the same lemma, e.g. ^माझा/मी<det><p1><mf><sg><obl>+च<gen><m><sg><nom>$.

Another note: Marathi has three grammatical genders and two grammatical numbers, and the same double agreement applies across all of them.
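
The agreement between the genitive marker and the possessed noun can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical, and the tag names nt (neuter) and pl (plural) are assumptions, since only m, f and sg appear in the examples above. It compares the gender and number on the last sub-reading of the determiner with those on the noun.

```python
import re

# Assumed tag inventories; only m, f and sg appear in the examples above.
GENDERS = {"m", "f", "nt"}
NUMBERS = {"sg", "pl"}


def tags_of(reading):
    """Return the tag names of one reading, e.g. ['gen', 'f', 'sg', 'nom']."""
    return re.findall(r"<([^<>]+)>", reading)


def pick(tags, options):
    """Return the first tag that belongs to `options`, or None."""
    return next((t for t in tags if t in options), None)


def genitive_agrees(determiner_unit, noun_unit):
    """Check that the genitive marker matches the noun in gender and number.

    Hypothetical helper: assumes each unit has a single analysis and that
    the genitive marker is the last '+'-joined sub-reading of the determiner.
    """
    gen_reading = determiner_unit.strip("^$").split("/", 1)[1].split("+")[-1]
    noun_reading = noun_unit.strip("^$").split("/", 1)[1]
    gen_tags, noun_tags = tags_of(gen_reading), tags_of(noun_reading)
    if "gen" not in gen_tags:
        raise ValueError("last sub-reading is not a genitive marker")
    return (pick(gen_tags, GENDERS) == pick(noun_tags, GENDERS)
            and pick(gen_tags, NUMBERS) == pick(noun_tags, NUMBERS))


tyachi = "^त्याची/तो<det><dist><m><sg><obl>+च<gen><f><sg><nom>$"
print(genitive_agrees(tyachi, "^गाय/गाय<n><f><sg><nom>$"))
print(genitive_agrees(tyachi, "^घोडा/घोडा<n><m><sg><nom>$"))
```

The possessor's own gender and number, carried by the first sub-reading, play no part in this check, which is why the two-lemma analysis is needed.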

Analysis of pronouns and determiners

Clitics

Constraint Grammar rules

apertium-mar-hin

Gender and number agreement