WX notation

WX notation is an ASCII transliteration scheme for Indian languages. It is used here to represent the Devanagari alphabet, which is used to write Sanskrit, Hindi, Nepali, Marathi and many other Indian languages. Devanagari script also has Unicode support.



<anudev> There are some issues regarding assigning some letters of Hindi with Unicode. 
<anudev> They are still unresolved.
<anudev> Actually there is the issue of separate vowels and matras.
<avinesh_> Could you give an example?
<anudev> We don't need the vowels and matras(markers) differently.
<avinesh_> For every matra there is a mapping in wx.
<anudev> Like a, aa, ii.
<avinesh_> yeah 
<anudev> but again ी ू े 
<anudev> are not needed
<spectie> matras = ?
<avinesh_> matra is the later representation
<anudev> matras= markers
<anudev> ka
<anudev> kaa
<anudev> we will write kaa as kA in wx
<anudev> in unicode there is a separate place for both A and the marker aa
<anudev> we need a same code for both of them,
<avinesh_> sry still not getting ur point why should we use wx instead of unicode?
<avinesh_> but people only follow one convention either the A or aa 
<avinesh_> not both
<avinesh_> i mean if u see a document 
<avinesh_> it will generally be consistent
<anudev> I mean we write A for both the vowel and matra
<avinesh_> oh..
<avinesh_> ok
<avinesh_> got it
<anudev> but unicode will write differently for A as a vowel and matra
<avinesh_> k got it
<anudev> so it creates unnecessary complication
<spectie> so the problem is that in unicode
<spectie> combining characters have a separate code point
<spectie> and in WX they are unified to one code point?
<spectie> = letter
<anudev> yes
<spectie> why not use unicode normalisation ?
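
The point raised in the discussion above can be checked with Python's standard `unicodedata` module. The following is a minimal sketch, not part of the original discussion: the helper names `describe` and `unified_by_normalisation` are made up for illustration. It looks up the independent vowel आ and the matra ा, which WX both writes as `A`, and tests whether any of the four standard Unicode normalisation forms maps them to the same string.

```python
import unicodedata

INDEPENDENT_AA = "\u0906"  # independent vowel, as in the first letter of "Ama"
MATRA_AA = "\u093E"        # dependent vowel sign (matra), as in "rAma"


def describe(ch):
    """Return the code point and Unicode character name (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def unified_by_normalisation(a, b):
    """True if some standard normalisation form maps a and b to the same string."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in forms
    )


print(describe(INDEPENDENT_AA))  # U+0906 DEVANAGARI LETTER AA
print(describe(MATRA_AA))        # U+093E DEVANAGARI VOWEL SIGN AA
print(unified_by_normalisation(INDEPENDENT_AA, MATRA_AA))  # False
```

Under these assumptions, the two characters stay distinct after normalisation, so merging them the way WX does would need an explicit mapping table rather than normalisation alone.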


  • राम = र्+आ+म्+अ (rAma)
  • कृष्ण = क्+ऋ+ष्+ण्+अ (kqRNa)
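
The two decompositions above can be reproduced with a small Devanagari-to-WX converter. This is only a sketch under stated assumptions: the function name `to_wx` and the table names are hypothetical, the tables follow the commonly cited WX letter assignments and cover only the basic vowels, consonants and three signs, and no schwa deletion or nukta handling is attempted. Note that the independent vowels and the matras map to the same WX letters, which is the unification discussed in the chat log.

```python
VIRAMA = "\u094D"  # suppresses the inherent 'a' of the preceding consonant

# Assumed WX letter assignments (basic inventory only).
CONSONANTS = dict(zip("कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह",
                      "kKgGfcCjJFtTdDNwWxXnpPbBmyrlvSRsh"))
VOWELS = dict(zip("अआइईउऊऋएऐओऔ", "aAiIuUqeEoO"))
# Matras (dependent vowel signs), written as escapes to keep them readable.
MATRAS = dict(zip("\u093E\u093F\u0940\u0941\u0942\u0943\u0947\u0948\u094B\u094C",
                  "AiIuUqeEoO"))
# Candrabindu, anusvara, visarga.
SIGNS = {"\u0901": "z", "\u0902": "M", "\u0903": "H"}


def to_wx(text):
    """Transliterate Devanagari text to WX (hypothetical helper)."""
    out = []
    pending_a = False  # last consonant is still owed its inherent 'a'
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in MATRAS:
            out.append(MATRAS[ch])  # the matra replaces the inherent 'a'
            pending_a = False
        elif ch == VIRAMA:
            pending_a = False       # bare consonant, no vowel follows
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            # Unknown characters pass through unchanged.
            out.append(VOWELS.get(ch) or SIGNS.get(ch) or ch)
    if pending_a:
        out.append("a")
    return "".join(out)


print(to_wx("राम"))    # rAma
print(to_wx("कृष्ण"))  # kqRNa
print(to_wx("आम"))     # Ama
```

The last call shows the independent vowel आ producing the same `A` that the matra produces in `rAma`.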

External links