Difference between revisions of "WX notation"
Revision as of 12:09, 28 March 2009
WX notation is an ASCII scheme for representing Devanagari, the script used by Hindi, Nepali, Marathi and many other Indian languages.
Table
Details
<anudev> there r some issues of assigning some letters of hindi with Unicode
<anudev> still unresolved
<anudev> actually there is the issue of separate vowels and matras
<avinesh_> could u give an example
<anudev> we don't need the vowels and matras(markers) differently
<avinesh_> because for every matra there is a mapping in wx
<anudev> like a, aa, ii, u r there
<avinesh_> yeah
<anudev> but again ी ू े
<anudev> are not needed
<spectie> matras = ?
<avinesh_> matra is the later representation
<anudev> matras= markers
<anudev> ka
<anudev> kaa
<anudev> we will write kaa as kA in wx
<anudev> in unicode there is a separate place for both A and the marker aa
<anudev> we need a same code for both of them,
<avinesh_> sry still not getting ur point why should we use wx instead of unicode?
<avinesh_> but people only follow one convention either the A or aa
<avinesh_> not both
<avinesh_> i mean if u see a document
<avinesh_> it will generally be consistent
<anudev> I mean we write A for both the vowel and matra
<avinesh_> oh..
<avinesh_> ok
<avinesh_> got it
<anudev> but unicode will write differently for A as a vowel and matra
<avinesh_> k got it
<anudev> so it creates unnecessary complication
<spectie> so the problem is that in unicode
<spectie> combining characters have a separate code point
<spectie> and in WX they are unified to one code point?
<spectie> = letter
<anudev> yes
<spectie> why not use unicode normalisation ?
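The vowel/matra point in the log can be checked with a short Python sketch. It looks at the independent vowel आ (U+0906) and its matra ा (U+093E), maps both to the single WX letter `A`, and tests whether any of the four standard Unicode normalisation forms makes the two code points equal. The `WX` dictionary is a two-entry illustrative excerpt and `normalisation_unifies` is a hypothetical helper name, not part of any existing tool.

```python
import unicodedata

VOWEL_AA = "\u0906"  # आ, independent vowel letter
MATRA_AA = "\u093e"  # ा, dependent vowel sign (matra)

# Illustrative two-entry excerpt of a WX table (assumption: not a full mapping).
WX = {VOWEL_AA: "A", MATRA_AA: "A"}

# Unicode keeps two code points; WX writes the same letter for both.
for ch in (VOWEL_AA, MATRA_AA):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", WX[ch])


def normalisation_unifies(a, b):
    """Return True if any standard normalisation form maps a and b to the same string."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in forms
    )


print(normalisation_unifies(VOWEL_AA, MATRA_AA))
```

With the Unicode data shipped in current Python versions the last line prints `False`: normalisation leaves the letter and the vowel sign distinct, so unifying them needs an explicit mapping such as the WX table.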
Examples
- राम = र्+आ+म्+अ (rAma)
- कृष्ण = क्+ऋ+ष्+ण्+अ (kqRNa)
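The two decompositions above can be reproduced with a minimal Devanagari-to-WX sketch in Python. It is only an illustration: the tables cover just the letters in these two words, and `to_wx` is a hypothetical function name. It assumes that a consonant carries an inherent `a` unless a matra or a virama follows, and that a word-final inherent `a` is written out, as in `rAma`.

```python
# Illustrative excerpt of the WX table: only the letters needed for the examples.
CONSONANTS = {
    "\u0915": "k",  # क
    "\u0923": "N",  # ण
    "\u092e": "m",  # म
    "\u0930": "r",  # र
    "\u0937": "R",  # ष
}
# Independent vowels and matras share one WX letter.
VOWELS = {"\u0905": "a", "\u0906": "A", "\u090b": "q"}  # अ आ ऋ
MATRAS = {"\u093e": "A", "\u0943": "q"}                 # ा ृ
VIRAMA = "\u094d"  # ्, suppresses the inherent vowel


def to_wx(text):
    """Convert Devanagari text to WX using the small tables above."""
    out = []
    pending_a = False  # True right after a consonant that has no vowel yet
    for ch in text:
        if ch in MATRAS:
            out.append(MATRAS[ch])  # the matra replaces the inherent "a"
            pending_a = False
        elif ch == VIRAMA:
            pending_a = False       # bare consonant, no vowel follows
        else:
            if pending_a:
                out.append("a")     # previous consonant keeps its inherent "a"
            pending_a = False
            if ch in CONSONANTS:
                out.append(CONSONANTS[ch])
                pending_a = True
            elif ch in VOWELS:
                out.append(VOWELS[ch])
            else:
                out.append(ch)      # pass unmapped characters through unchanged
    if pending_a:
        out.append("a")             # word-final inherent "a"
    return "".join(out)


print(to_wx("राम"))    # rAma
print(to_wx("कृष्ण"))  # kqRNa
```

A full converter would need the complete consonant and vowel tables plus signs such as anusvara and visarga, which this sketch leaves out.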
External links
- WX notation: Overview
- NLP: A paninian perspective (page 191)