Difference between revisions of "WX notation"

From Apertium
* [http://ltrc.iiit.net/downloads/nlpbook/nlp-panini.pdf NLP: A Paninian Perspective] (page 191) [comment: this link does not work. ----svaksha]


----
[[Category:Terminology]]

Revision as of 06:41, 13 October 2009

WX notation is used to represent, in ASCII, the Devanagari script, which is used to write Sanskrit, Hindi, Nepali, Marathi and many other Indian languages. The Devanagari script also has Unicode support.


Table

Details

<anudev> there r some issues of assigning some letters of hindi with Unicode 
<anudev> still unresolved
<anudev> actually there is the issue of separate vowels and matras
<avinesh_> could u give an example
<anudev> we don't need the vowels and matras(markers) differently
<avinesh_> because for every matra there is a mapping in wx
<anudev> like a, aa, ii, u r there
<avinesh_> yeah 
<anudev> but again ी ू े 
<anudev> are not needed
<spectie> matras = ?
<avinesh_> matra is the later representation
<anudev> matras= markers
<anudev> ka
<anudev> kaa
<anudev> we will write kaa as kA in wx
<anudev> in unicode there is a separate place for both A and the marker aa
<anudev> we need a same code for both of them,
<avinesh_> sry still not getting ur point why should we use wx instead of unicode?
<avinesh_> but people only follow one convention either the A or aa 
<avinesh_> not both
<avinesh_> i mean if u see a document 
<avinesh_> it will generally be consistent
<anudev> I mean we write A for both the vowel and matra
<avinesh_> oh..
<avinesh_> ok
<avinesh_> got it
<anudev> but unicode will write differently for A as a vowel and matra
<avinesh_> k got it
<anudev> so it creates unnecessary complication
<spectie> so the problem is that in unicode
<spectie> combining characters have a separate code point
<spectie> and in WX they are unified to one code point?
<spectie> = letter
<anudev> yes
<spectie> why not use unicode normalisation ?
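
The point raised in the discussion above can be checked directly. The following is a minimal sketch, using only Python's standard `unicodedata` module, that prints the two code points for the vowel written "A" in WX (the independent vowel and the matra) and tests whether any of the four standard normalisation forms maps them to the same string. The names `independent_aa`, `matra_aa` and `unified_by_normalisation` are illustrative only and do not come from the discussion.

```python
import unicodedata

# The same vowel, written "A" in WX, has two separate Unicode code points.
independent_aa = "\u0906"  # independent vowel, as at the start of a word
matra_aa = "\u093e"        # dependent vowel sign (matra), attached to a consonant

for ch in (independent_aa, matra_aa):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")


def unified_by_normalisation(a, b):
    """Return True if any standard normalisation form maps a and b to the same string."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in forms
    )


print(unified_by_normalisation(independent_aa, matra_aa))
```

If the check prints `False`, normalisation on its own does not merge the vowel and the matra, so an explicit mapping would still be needed to get a single symbol for both.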

Examples

  • राम = र्+आ+म्+अ (rAma)
  • कृष्ण = क्+ऋ+ष्+ण्+अ (kqRNa)
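
The decompositions above can be reproduced with a small converter. This is only a sketch under stated assumptions: the mapping tables are partial and reflect the usual WX letter assignments as far as they are needed here, and the function name `devanagari_to_wx` is hypothetical, not part of any Apertium tool. It shows the unification discussed under Details: an independent vowel and its matra map to the same WX letter, and a consonant keeps its inherent "a" unless a matra or a virama follows.

```python
# Partial tables (assumption: standard WX assignments, not a complete list).
VOWELS = {
    "\u0905": "a", "\u0906": "A", "\u0907": "i", "\u0908": "I",
    "\u0909": "u", "\u090a": "U", "\u090b": "q", "\u090f": "e",
    "\u0910": "E", "\u0913": "o", "\u0914": "O",
}
# Matras map to the same WX letters as the independent vowels above.
MATRAS = {
    "\u093e": "A", "\u093f": "i", "\u0940": "I", "\u0941": "u",
    "\u0942": "U", "\u0943": "q", "\u0947": "e", "\u0948": "E",
    "\u094b": "o", "\u094c": "O",
}
CONSONANTS = {
    "\u0915": "k", "\u0917": "g", "\u0923": "N", "\u0924": "w",
    "\u0926": "x", "\u0928": "n", "\u092a": "p", "\u092e": "m",
    "\u092f": "y", "\u0930": "r", "\u0932": "l", "\u0935": "v",
    "\u0936": "S", "\u0937": "R", "\u0938": "s", "\u0939": "h",
}
VIRAMA = "\u094d"  # suppresses the inherent vowel of the preceding consonant


def devanagari_to_wx(text):
    """Convert Devanagari text to WX using the partial tables above."""
    out = []
    pending_a = False  # True while the last consonant still carries its inherent "a"
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in MATRAS:
            out.append(MATRAS[ch])  # the matra replaces the inherent "a"
            pending_a = False
        elif ch == VIRAMA:
            pending_a = False       # bare consonant, no vowel written
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            # Independent vowels are mapped; anything unknown passes through unchanged.
            out.append(VOWELS.get(ch, ch))
    if pending_a:
        out.append("a")
    return "".join(out)


print(devanagari_to_wx("राम"))
print(devanagari_to_wx("कृष्ण"))
```

Characters missing from the tables are passed through unchanged, so a fuller implementation would need the complete consonant set plus signs such as anusvara and visarga.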

External links