Difference between revisions of "Malayalam and English/documentation"



== Verbs ==
{|class=wikitable
! Form !! Description !! Tag !! Example !! Translation
|-
| Present || {{sc|stem}}k-unnu || {{tag|pres}} || kuttikal kalikkunnu<br/>children play || The children are playing.
|-
| Future || {{sc|stem}}-um || {{tag|fut}} || naale mala peyyum<br/>tomorrow rain will.fall || It will rain tomorrow.
|-
| Present progressive || {{sc|pres}}k-unt || || aval nannaayi pathikkunt<br/>she well studying.is || She is studying well.
|-
| Present progressive (II) || {{sc|inf}} ān || || siita avite irikkuka ān<br/>Sita there sit is || Sita is sitting there.
|-
| Iterative present || {{sc|stem}}-kontu-iri-kk-unnu || || avan paatikkontirikkunnu<br/>He singing.is || He is singing.
|-
| Iterative future || {{sc|stem}}-kontu-iri-kk-um || || avan paatikkontirikkum<br/>He singing.will.be || He will be singing.
|-
| Iterative past || {{sc|stem}}-kontu-iri-unnu || || avan paatikkontirunnu<br/>He singing.was || He was singing.
|-
| Continuous iterative || {{sc|stem}}-konte-iri-kunnu || || kuttikal paatikkonteeyirunnu<br/>children sang.without.stopping || The children sang without stopping.
|-
| Perfect || || || innale mala peytirunnu<br/>yesterday rain fell || It rained yesterday.
|-
| Contemporaneous perfect || || || yuddham pottippurappettirikkunnu<br/>war broken#out.has || War has broken out!
|-
| Remote perfect || || || ñaan paattŭ pathiccittuntŭ<br/>I music studied.had || I had studied music.
|-
| Habitual present || || || juun maasattil mala peyyaaruntŭ<br/>June month.in rain falls.usually || It usually rains in June.
|-
| Habitual past || || || ñaan delhiyil pookaaruntaayirunnu<br/>I Delhi.to go.used#to || I used to go to Delhi.
|-
| Imperative || || || putiya vidyaarthikal hedmaasrrare kaaneentataanŭ<br/>new students headmaster meet.should || New students should meet the headmaster.
|-
| Promissive || {{sc|past}}-ām || || ñaan naale varaam<br/>I tomorrow come.will || I will come tomorrow.
|-
| Emphatic promissive || {{sc|past}}-ēk-ām || || ñaan naale vanneekkaam<br/>I tomorrow come.will || I will come tomorrow.
|-
| Permissive || {{sc|past}}-ō (kolluu) || || vannoo<br/>you.may.come || You may come.
|-
| Permissive (II) || {{sc|past}}-ootte || || avan avite irunnootte<br/>He there sit.let || Let him sit there.
|-
| Permissive (III) || || || avar avite taamasikkatte<br/>they there stay.let || Let them stay there.
|-
| Permissive (Formal) || || || paas ullavarkkŭ itilee pookaavunnatŭ aanŭ<br/>pass having this.way go.may is || Those who have a pass may go this way.
|-
| Optative || || || mala peyyatte<br/>rain fall.let || Let it rain.
|-
| Precative || {{sc|stem}}-anē (= {{sc|stem}}-uka-vēnam-ē) || || mala peyyanee<br/>rain fall.may || May it rain.
|-
|}
<pre>
%<quot%>

Revision as of 16:54, 16 August 2014

Malayalam is both an agglutinative and an inflecting language. It belongs to the Dravidian language family. In Apertium we are implementing the English-Malayalam pair using HFST, as described in Starting_a_new_language_with_HFST.

Morphotactics using lexc

Let's take a noun as an example. A Malayalam noun can take 8 case inflections: nominative, accusative, dative, sociative, genitive, instrumental, locative and vocative. Nouns are also classified by number, singular and plural. Let's declare the essential symbols:

Multichar_Symbols

%<n%>           ! Noun                        ! നാമം

%<nom%>         ! Nominative                  !

%<acc%>         ! Accusative                  !

%<dat%>         ! Dative                      !

%<soc%>         ! Sociative                   !

%<gen%>         ! Genitive                    !

%<ins%>         ! Instrumental                !

%<loc%>         ! Locative                    !

%<voc%>         ! Vocative                    !

%<sg%>          ! Singular                    !

%<pl%>          ! Plural                      !

Now that we have all the essential symbols for a noun, let's add an example paradigm:

LEXICON Root

Miscellaneous ;
Conjunctions ; 
Postpositions ;
Pronouns ;
Determiners ;
Numerals ;
NominalStems ;

Nouns

LEXICON N1 

%<n%>%<sg%>%<nom%>:ṁ CLIT-N-NOM ; ! ṁ
%<n%>%<sg%>%<loc%>:%>ttil‍ CLIT-N-LOC ; ! ttil
%<n%>%<sg%>%<acc%>:%>tte CLIT-N-ACC ; ! tte
%<n%>%<sg%>%<gen%>:%>ttinṟe CLIT-N-GEN ; ! ttinṟe
%<n%>%<sg%>%<dat%>:%>ttin CLIT-N ; ! ttin
%<n%>%<sg%>%<dat%>:%>ttinu CLIT-N ; ! ttinu ! debug
!plural
%<n%>%<pl%>%<nom%>:%>ṅṅaḷ‍ CLIT-N-NOM ; ! ṅṅaḷ‍
%<n%>%<pl%>%<acc%>:%>ṅṅaḷe CLIT-N-ACC ; ! ṅṅaḷe
%<n%>%<pl%>%<gen%>:%>ṅṅaḷuṭe CLIT-N-GEN ; ! ṅṅaḷuṭe

and an example word

LEXICON NominalStems
mēghaṁ:mēgha N1 ; ! mēghaṁ ! cloud
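
The way the stem entry and the N1 paradigm combine can be sketched in a few lines of Python. This is only an illustration of how lexc concatenates the upper (analysis) and lower (surface) sides along a continuation class; it is not part of the Apertium or HFST toolchain. It assumes that the CLIT-N-* continuation lexicons add nothing, and that the `%>` morpheme boundary is simply deleted on the surface, whereas in the real pair the sandhi rules may do more.

```python
# Entries of LEXICON N1 as (upper-side tags, lower-side suffix) pairs,
# copied from the paradigm above. ">" stands for the %> boundary symbol.
LEMMA, STEM = "mēghaṁ", "mēgha"  # from the entry  mēghaṁ:mēgha N1 ;

N1 = [
    ("<n><sg><nom>", "ṁ"),
    ("<n><sg><loc>", ">ttil"),
    ("<n><sg><acc>", ">tte"),
    ("<n><sg><gen>", ">ttinṟe"),
    ("<n><sg><dat>", ">ttin"),
    ("<n><sg><dat>", ">ttinu"),
    ("<n><pl><nom>", ">ṅṅaḷ"),
    ("<n><pl><acc>", ">ṅṅaḷe"),
    ("<n><pl><gen>", ">ṅṅaḷuṭe"),
]


def expand(lemma, stem, paradigm):
    """Return (analysis, surface) pairs for one stem and one paradigm."""
    pairs = []
    for tags, suffix in paradigm:
        analysis = lemma + tags
        # Assumption: the boundary symbol is dropped without further changes.
        surface = (stem + suffix).replace(">", "")
        pairs.append((analysis, surface))
    return pairs


PAIRS = expand(LEMMA, STEM, N1)
for analysis, surface in PAIRS:
    print(f"{analysis}\t{surface}")
```

Each printed line pairs an analysis string with the surface form it would generate under these assumptions.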

Currently there are 10 noun paradigms: N1, N2, ..., N10.

General trends in paradigms

LEXICON N1 :- words ending with anusvara ( ം )

LEXICON N2 :- words ending with the vowel a or i

LEXICON N3 :- words ending with virama or a vowel

LEXICON N4 :- words ending with the vowel a or i

LEXICON N5 :- words ending with virama (e.g. വീട് )

LEXICON N6 :- for the word പേര്‍ (pēr‍)

LEXICON N7 :- words ending with the vowel u

LEXICON N8 :-

LEXICON N9 :-

LEXICON N10 :-

Proper Nouns

  • LEXICON NP* is for proper nouns; proper nouns behave almost like common nouns
  • LEXICON NP*-COG is for second names
  • LEXICON NP-TOP-* represents place names

They are:

  1. NP-TOP-KERALA :- place names ending with anusvara
  2. NP-TOP-INDIA :- place names ending with the vowel a or i
  3. NP-TOP-CALICUT :- place names ending with virama
  4. NP-TOP-KANNUR :- place names ending with chillu
  5. NP-TOP-MALABAR :- place names ending with chillu R ( ര്‍ )
  6. NP-TOP-JAPAN :- place names ending with chillu ന്‍
  7. NP-TOP-BRAZIL :- place names ending with chillu ല്‍

Pronouns

  • PRON-PERS-* represents personal pronouns

they are

  1. PRON-PERS-NNAAN :-
  2. PRON-PERS-NII :-
  3. PRON-PERS-AVAN :-
  4. PRON-PERS-AVAL :-
  5. PRON-PERS-NNANNAL
  6. PRON-PERS-NAAM
  7. PRON-PERS-NINNAL
  8. PRON-PERS-AVAR
  9. PRON-PERS-ADDEHA
  • PRON-DEM is for demonstrative pronouns

they are

  1. PRON-DEM-AT
  2. PRON-DEM-IT
  • PRON-IND is for indefinite pronouns

Numerals

NUM for numerals

Verbs

{|class=wikitable
! Form !! Description !! Tag !! Example !! Translation
|-
| Present || {{sc|stem}}k-unnu || {{tag|pres}} || kuttikal kalikkunnu<br/>children play || The children are playing.
|-
| Future || {{sc|stem}}-um || {{tag|fut}} || naale mala peyyum<br/>tomorrow rain will.fall || It will rain tomorrow.
|-
| Present progressive || {{sc|pres}}k-unt || || aval nannaayi pathikkunt<br/>she well studying.is || She is studying well.
|-
| Present progressive (II) || {{sc|inf}} ān || || siita avite irikkuka ān<br/>Sita there sit is || Sita is sitting there.
|-
| Iterative present || {{sc|stem}}-kontu-iri-kk-unnu || || avan paatikkontirikkunnu<br/>He singing.is || He is singing.
|-
| Iterative future || {{sc|stem}}-kontu-iri-kk-um || || avan paatikkontirikkum<br/>He singing.will.be || He will be singing.
|-
| Iterative past || {{sc|stem}}-kontu-iri-unnu || || avan paatikkontirunnu<br/>He singing.was || He was singing.
|-
| Continuous iterative || {{sc|stem}}-konte-iri-kunnu || || kuttikal paatikkonteeyirunnu<br/>children sang.without.stopping || The children sang without stopping.
|-
| Perfect || || || innale mala peytirunnu<br/>yesterday rain fell || It rained yesterday.
|-
| Contemporaneous perfect || || || yuddham pottippurappettirikkunnu<br/>war broken#out.has || War has broken out!
|-
| Remote perfect || || || ñaan paattŭ pathiccittuntŭ<br/>I music studied.had || I had studied music.
|-
| Habitual present || || || juun maasattil mala peyyaaruntŭ<br/>June month.in rain falls.usually || It usually rains in June.
|-
| Habitual past || || || ñaan delhiyil pookaaruntaayirunnu<br/>I Delhi.to go.used#to || I used to go to Delhi.
|-
| Imperative || || || putiya vidyaarthikal hedmaasrrare kaaneentataanŭ<br/>new students headmaster meet.should || New students should meet the headmaster.
|-
| Promissive || {{sc|past}}-ām || || ñaan naale varaam<br/>I tomorrow come.will || I will come tomorrow.
|-
| Emphatic promissive || {{sc|past}}-ēk-ām || || ñaan naale vanneekkaam<br/>I tomorrow come.will || I will come tomorrow.
|-
| Permissive || {{sc|past}}-ō (kolluu) || || vannoo<br/>you.may.come || You may come.
|-
| Permissive (II) || {{sc|past}}-ootte || || avan avite irunnootte<br/>He there sit.let || Let him sit there.
|-
| Permissive (III) || || || avar avite taamasikkatte<br/>they there stay.let || Let them stay there.
|-
| Permissive (Formal) || || || paas ullavarkkŭ itilee pookaavunnatŭ aanŭ<br/>pass having this.way go.may is || Those who have a pass may go this way.
|-
| Optative || || || mala peyyatte<br/>rain fall.let || Let it rain.
|-
| Precative || {{sc|stem}}-anē (= {{sc|stem}}-uka-vēnam-ē) || || mala peyyanee<br/>rain fall.may || May it rain.
|}
%<quot%>
%<enum%>        ! Enumerative                 !

%<subst%>       ! Substantive                 !
%<attr%>        ! Attributive                 !

%<iv%>          ! Intransitive                ! 
%<tv%>          ! Transitive                  ! 

%<neg%>         ! Negative                    !

%<pres%>        ! Present tense               ! വര്‍ത്ത്മാന കാലം 
%<past%>        ! Past tense                  ! ഭൂത കാലം 
%<fut%>         ! Future tense                ! 
%<perf%>        ! Simple Perfect              !
%<rem_perf%>    ! Remote Perfect              !
%<contpres%>    ! Contemporaneous perfect     !
%<perm%>        ! Permissive mood             !
%<imp%>         ! Imperative mood             !
%<hab%>         ! Habitual aspect             ! 

%<prec%>        ! Precative mood              !
%<opt%>         ! Optative mood               !
%<irre%>        ! Irrealis mood               !
%<satis%>       ! Satisfactive mood           !
%<monit%>       ! Monitory mood               !

%<frml%>        ! Formal                      !
%<infml%>       ! Informal                    !

%<inf_k%>       ! Infinitive                  !
%<inf_n%>       ! Purposive infinitive        !
%<oblig%>       ! Obligative                  ! 
%<simul%>       ! Simultaneous                !

%<iter%>        ! Iterative                   !
%<cond%>        ! Conditional mood            !

%<gpr_pres%>    ! 
%<gpr_past%>    !

Currently more than 25 verb paradigms have been added to the lexc file. Unlike nouns, it is difficult to predict a verb's paradigm from the shape of the word.

Verb paradigms are of the form

LEXICON V-TV-ARIYUKA

%<v%>%<tv%>: V-COMMON-ARIYUKA ; ! ""

and

LEXICON V-IV-KALI

%<v%>%<iv%>: V-COMMON-KALI ; ! ""

Here IV stands for intransitive verb and TV for transitive verb.

The continuation paradigm V-COMMON-* is shared by both, e.g. V-COMMON-ATIKKUKA:

LEXICON V-COMMON-ATIKKUKA
%<inf_k%>:%>kku k CLIT-CC ; ! "" ! FIXME
%<inf_n%>:%>kkan‍ CLIT-CC ; ! "̔" !
%<perf%>:%>chchiru nnu  CLIT-ITG ; ! "̔" !
%<rem_perf%>:%>chchiṭṭu ṇṭ # ; ! "' !""
%<pres%>:%>kku nnu  NEG-WHEN ; ! "̔̔"
%<past%>:%>chchu  NEG-WHEN ; ! ""
%<fut%>:%>kku ' NEG-WHEN ; ! ""
%<pass%>:%>kkppe PASS-CONT ;
%<iter%>:%>chchu koṇṭi ITER-TENS; ! ""
%<iter%>%<cont%>:%>chchu koṇṭeyi ITER-TENS; ! ""
%<gpr_pres%>:%>kku nn GPR-PRES ; ! ""
%<gpr_past%>:%>chch GPR-PAST ; ! ""
%<hab%>:%>kkar‍ CLIT-COP-UNTU ; ! "ār"
%<imp%>:%>kk CONT_IMP; !""
%<pcpl%>:%>chch # ; ! ""
%<contpres%>:%>chchirikku nnu  # ;
%<prec%>:%>kk PREC-CONT ; ! "" ! ""
%<opt%>:%>kkṭṭe # ;
%<irre%>%<past%>:%>chchene # ;
%<cond%>:%>chchal‍  NEG-WHEN ; ! "ccāl‍"
%<monit%>%<fut%>:%>kku me # ;
%<satis%>%<fut%>:%>kku mllo #;
%<satis%>%<past%>:%>chchllo #;
%<satis%>%<pres%>:%>kku nnllo #;
%<oblig%>:%>kkṇ' # ;

It contains continuation lexicons such as CLIT-CC, CLIT-ITG and NEG-WHEN.

  • Passive verbs are added using the continuation lexicon PASS-CONT (%<pass%>:%>kkppe PASS-CONT ;)
  • The passive verb lexicon is defined as
LEXICON PASS-CONT
%<inf_k%>:%>ṭu k CLIT-CC ; ! "" ! FIXME
%<inf_n%>:%>ṭan‍ CLIT-CC ; ! "̔" ! 
%<perf%>:%>ṭṭiru nnu  CLIT-ITG ; ! "̔" ! 
%<pres%>:%>ṭu nnu  NEG-WHEN ; ! "̔̔"
%<past%>:%>ṭṭu  NEG-WHEN ; ! ""
%<fut%>:%>ṭu ' NEG-WHEN ; ! ""
%<iter%>%<pres%>:%>ṭṭu koṇṭirikku nnu  NEG-WHEN ; ! ""
%<iter%>%<past%>:%>ṭṭu koṇṭiru nnu  NEG-WHEN ; ! ""
%<iter%>%<fut%>:%>ṭṭu koṇṭirikku ' NEG-WHEN ; ! ""
%<gpr_pres%>:%>ṭu nn GPR-PRES ; ! ""
%<gpr_past%>:%>ṭṭ GPR-PAST ; ! ""
%<imp%>:%>ṭ # ; !""
%<imp%>%<frml%>:%>ṭṇ' # ; !""
%<imp%>%<infml%>:%>ṭu  # ; !""
%<hab%>:%>ṭar‍ CLIT-COP-UNTU ; ! "ār"
%<pcpl%>:%>ṭṭ # ; ! ""
%<contpres%>:%>ṭṭikku nnu  # ; 
%<prec%>:%>ṭṇe # ; 
%<opt%>:%>ṭṭṭe # ; 
%<irre%>%<past%>:%>ṭṭene # ; 
%<cond%>:%>ṭṭal‍  NEG-WHEN ; ! "
%<monit%>%<fut%>:%>ṭu me # ;
%<satis%>%<fut%>:%>ṭu mllo #;
%<satis%>%<pres%>:%>ṭu nnllo #;
%<satis%>%<past%>:%>chchllo #;
%<oblig%>:%>ṭṇ' # ; 
%<itg%>:%>ṭu mo # ; 
  • Imperative mood is added using the continuation lexicon CONT_IMP ( %<imp%>:%>ക്ക CONT_IMP; !"" )
LEXICON CONT_IMP
 # ; !""
%<frml%>:%>ṇ' # ; !""
%<infml%>:%>u  # ; !""
  • Verbal adjectives are added using the continuation lexicon GPR-PRES
LEXICON GPR-PRES

%<subst%>:%>ത  N3-COMMON; 
# ;
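
How a form is built by following continuation lexicons can be sketched in Python. The snippet below is a toy walker, not the HFST compiler: it copies only the `%<imp%>` and `%<opt%>` entries of V-COMMON-ATIKKUKA and the three entries of CONT_IMP from above, uses the placeholder `STEM` instead of a real verb stem, and assumes the stem arrives already tagged `<v><tv>`. The `%>` boundary is kept as `>` in the output.

```python
# Each lexicon maps to a list of (upper tags, lower string, continuation).
# "#" is the lexc end-of-word continuation.
LEXICONS = {
    "V-COMMON-ATIKKUKA": [
        ("<imp>", ">kk", "CONT_IMP"),
        ("<opt>", ">kkṭṭe", "#"),
    ],
    "CONT_IMP": [
        ("", "", "#"),             # plain imperative: nothing added
        ("<frml>", ">ṇ'", "#"),    # formal imperative
        ("<infml>", ">u", "#"),    # informal imperative
    ],
}


def walk(lexicon, upper, lower):
    """Yield every (analysis, surface) pair reachable from `lexicon`."""
    if lexicon == "#":
        yield upper, lower
        return
    for tags, suffix, continuation in LEXICONS[lexicon]:
        yield from walk(continuation, upper + tags, lower + suffix)


# "STEM" is a placeholder; the <v><tv> prefix is an assumption about
# what the V-TV-* lexicon has already contributed.
FORMS = list(walk("V-COMMON-ATIKKUKA", "STEM<v><tv>", "STEM"))
for analysis, surface in FORMS:
    print(f"{analysis}\t{surface}")
```

The single `%<imp%>` entry fans out into three analyses because CONT_IMP offers three ways to finish the word, which is why the formal and informal variants do not need to be repeated in every verb paradigm.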

Adjectives

Four adjective paradigms are included in Apertium:

  1. LEXICON A1
  2. LEXICON A2
  3. LEXICON A3
  4. LEXICON A4

Adverbs

Five adverb paradigms have been added:

  1. ADV
  2. ADV1
  3. ADV2
  4. ADV3
  5. ADV4

Postpositions

Malayalam Sandhi Rules Implementation

Refer to Malayalam_and_English/sandh_in_malayalam

BiLingual Dictionary

The bilingual dictionary maps each source-language word to its target-language word. See Bilingual_dictionary.

Example:


<e><p><l>അഭിസംബോധന<s n="n"/></l><r>address<s n="n"/></r></p></e>
<e><p><l>കമ്യൂണിസ്റ്റ്<s n="n"/></l><r>communist<s n="n"/></r></p></e>
<e><p><l>ഗോത്രം<s n="n"/></l><r>caste<s n="n"/></r></p></e>
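
To check what such entries map, they can be read with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch for inspection only, not how Apertium itself loads the dictionary; the enclosing `<section>` element is added here just to make the fragment well-formed, and `read_pairs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The three entries above, wrapped in one root element so they parse.
BIDIX_FRAGMENT = """<section>
<e><p><l>അഭിസംബോധന<s n="n"/></l><r>address<s n="n"/></r></p></e>
<e><p><l>കമ്യൂണിസ്റ്റ്<s n="n"/></l><r>communist<s n="n"/></r></p></e>
<e><p><l>ഗോത്രം<s n="n"/></l><r>caste<s n="n"/></r></p></e>
</section>"""


def read_pairs(xml_text):
    """Return (left word, left tag, right word, right tag) per entry."""
    root = ET.fromstring(xml_text)
    pairs = []
    for entry in root.iter("e"):
        left = entry.find("p/l")
        right = entry.find("p/r")
        pairs.append((
            left.text,                # text before the first <s/> child
            left.find("s").get("n"),
            right.text,
            right.find("s").get("n"),
        ))
    return pairs


ENTRIES = read_pairs(BIDIX_FRAGMENT)
for left_word, left_tag, right_word, right_tag in ENTRIES:
    print(f"{left_word}<{left_tag}> -> {right_word}<{right_tag}>")
```

The left side (`<l>`) holds the Malayalam lemma and the right side (`<r>`) the English one; the `<s n="n"/>` element carries the noun tag on both sides.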

Transfer Rules

Transfer rules are handled by the structural transfer module; see A_long_introduction_to_transfer_rules.