Thaana romanisation

From Apertium
Revision as of 10:08, 8 March 2009 by Vaasu

We currently use a romanised form of the Thaana letters instead of the actual Unicode Thaana characters. This makes things a lot easier for us. The romanised output of the English to Dhivehi translation can be converted to Unicode with a simple character-by-character mapping, which is as follows:

h <char-0x0780> "letter haa
S <char-0x0781> "shaviani
n <char-0x0782> "noonu
r <char-0x0783> "raa
b <char-0x0784> "baa
L <char-0x0785> "lhaviani
k <char-0x0786> "kaafu
w <char-0x0787> "alifu
v <char-0x0788> "vaavu
m <char-0x0789> "meemu
f <char-0x078A> "faafu
d <char-0x078B> "dhaalu
t <char-0x078C> "thaa
l <char-0x078D> "laamu
g <char-0x078E> "gaafu
N <char-0x078F> "gnaviani
s <char-0x0790> "seenu
D <char-0x0791> "daviani
z <char-0x0792> "zaviani
T <char-0x0793> "taviani
y <char-0x0794> "yaa
p <char-0x0795> "paviani
j <char-0x0796> "javiani
c <char-0x0797> "chaviani

"THAANA DOTTED LETTERS (used in Arabic words)
X <char-0x0798> "TTAA (thaa mathee thinthiki)
H <char-0x0799> "HHAA (haa thiree ehthiki)
K <char-0x079A> "KHAA (haa mathee ehthiki)
J <char-0x079B> "THAALU (dhaa mathee ehthiki)
R <char-0x079C> "ZAA (raa mathee ehthiki)
C <char-0x079D> "SHEENU (seenu mathee thinthiki)
M <char-0x079E> "SAADHU (seenu thiree ehthiki)
B <char-0x079F> "DHAADHU (seenu mathee ehthiki)
Y <char-0x07A0> "TO (thaa thiree ehthiki)
Z <char-0x07A1> "ZO (thaa mathee ehthiki)
W <char-0x07A2> "AINU (alifu thiree ehthiki)
G <char-0x07A3> "GHAINU (alifu mathee ehthiki)
Q <char-0x07A4> "QAAFU (gaafu mathee dhethiki)
V <char-0x07A5> "VAAVU (vaavu mathee ehthiki)

"THAANA FILI (combining characters)
a <char-0x07A6> "abafili
A <char-0x07A7> "aabaafili
i <char-0x07A8> "ibifili
I <char-0x07A9> "eebeefili
u <char-0x07AA> "ubufili
U <char-0x07AB> "ooboofili
e <char-0x07AC> "ebefili
E <char-0x07AD> "ebeyfili
o <char-0x07AE> "obofili
O <char-0x07AF> "oaboafili
q <char-0x07B0> "sukun

Thaana is written from right to left. The romanisation, however, is written from left to right. For example, "I am a fisherman" outputs "waharenqnakI masqveriwewq" (read from left to right), which is "އަހަރެންނަކީ މަސްވެރިއެއް" (read from _right_ to left).
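The mapping can be applied mechanically, one character at a time. The following is a minimal Python sketch of such a converter, not the code actually used in the project: the names `ROMAN_TO_THAANA` and `romanised_to_thaana` are hypothetical, and it assumes that any character without an entry in the mapping (spaces, punctuation, digits) should pass through unchanged. It relies on the fact that the three groups of the mapping cover the consecutive code points U+0780 to U+07B0.

```python
# Romanised keys listed in code point order, one string per group of the mapping.
CONSONANTS = "hSnrbLkwvmfdtlgNsDzTypjc"  # U+0780..U+0797
DOTTED = "XHKJRCMBYZWGQV"                # U+0798..U+07A5 (dotted letters)
FILI = "aAiIuUeEoOq"                     # U+07A6..U+07B0 (fili and sukun)

ROMAN_KEYS = CONSONANTS + DOTTED + FILI

# Hypothetical lookup table: romanised character -> Unicode Thaana character.
ROMAN_TO_THAANA = {key: chr(0x0780 + i) for i, key in enumerate(ROMAN_KEYS)}


def romanised_to_thaana(text):
    """Convert romanised Dhivehi to Unicode Thaana, character by character.

    Assumption: characters without a mapping entry are copied unchanged.
    """
    return "".join(ROMAN_TO_THAANA.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(romanised_to_thaana("waharenqnakI masqveriwewq"))
```

The converter does not reverse the string. Unicode text is stored in reading order, so the first romanised character becomes the first Thaana character, and the right-to-left layout is left to whatever displays the text.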