Difference between revisions of "Thaana romanisation"

From Apertium

Revision as of 10:13, 8 March 2009

Currently we use a romanised form of the Thaana letters instead of the actual Unicode Thaana letters. This makes things a lot easier for us. The romanised output of the English-to-Dhivehi translation can be converted to Unicode by a simple one-to-one character mapping. The mapping is as follows:

h	<char-0x0780> "letter haa
S	<char-0x0781> "shaviani
n	<char-0x0782> "noonu
r	<char-0x0783> "raa
b	<char-0x0784> "baa
L	<char-0x0785> "lhaviani
k	<char-0x0786> "kaafu
w	<char-0x0787> "alifu
v	<char-0x0788> "vaavu
m	<char-0x0789> "meemu
f	<char-0x078A> "faafu
d	<char-0x078B> "dhaalu
t	<char-0x078C> "thaa
l	<char-0x078D> "laamu
g	<char-0x078E> "gaafu
N	<char-0x078F> "gnaviani
s	<char-0x0790> "seenu
D	<char-0x0791> "daviani
z	<char-0x0792> "zaviani
T	<char-0x0793> "taviani
y	<char-0x0794> "yaa
p	<char-0x0795> "paviani
j	<char-0x0796> "javiani
c	<char-0x0797> "chaviani

"THAANA DOTTED LETTERS (used in Arabic words)
X	<char-0x0798> "TTAA   (thaa mathee thin thiki)
H	<char-0x0799> "HHAA   (haa thiree ehthiki)
K	<char-0x079A> "KHAA   (haa mathee ehthiki)
J	<char-0x079B> "THAALU (dhaa mathee ehthiki)
R	<char-0x079C> "ZAA    (raa mathee ehthiki)
C	<char-0x079D> "SHEENU (seenu mathee thinthiki)
M	<char-0x079E> "SAADHU (seenu thiree ehthiki)
B	<char-0x079F> "DHAADHU (seenu mathee ehthiki)
Y	<char-0x07A0> "TO     (thaa thiree ehthiki)
Z	<char-0x07A1> "ZO     (thaa mathee ehthiki)
W	<char-0x07A2> "AINU   (alifu thiree ehthiki)
G	<char-0x07A3> "GHAINU (alifu mathee ehthiki)
Q	<char-0x07A4> "QAAFU  (gaafu mathee dhethiki)
V	<char-0x07A5> "VAAVU  (vaavu mathee ehthiki)

"THAANA FILI (combining characters)
a	<char-0x07A6> "abafili
A	<char-0x07A7> "aabaafili
i	<char-0x07A8> "ibifili
I	<char-0x07A9> "eebeefili
u	<char-0x07AA> "ubufili
U	<char-0x07AB> "ooboofili
e	<char-0x07AC> "ebefili
E	<char-0x07AD> "ebeyfili
o	<char-0x07AE> "obofili
O	<char-0x07AF> "oaboafili
q	<char-0x07B0> "sukun
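
Because the table lists the romanised characters in code point order (U+0780 through U+07B0 with no gaps), the conversion can be sketched in a few lines of Python. This is only an illustration of the mapping, not the tool used in the project; the names `ROMAN`, `TO_THAANA` and `romanised_to_thaana` are made up for this example.

```python
# Romanised characters in the same order as the table above: the i-th
# character maps to code point 0x0780 + i (U+0780 haa ... U+07B0 sukun).
ROMAN = (
    "hSnrbLkwvmfdtlgNsDzTypjc"  # basic letters, U+0780..U+0797
    "XHKJRCMBYZWGQV"            # dotted letters, U+0798..U+07A5
    "aAiIuUeEoOq"               # fili and sukun, U+07A6..U+07B0
)

# Translation table for str.translate: source ordinal -> target ordinal.
TO_THAANA = {ord(r): 0x0780 + i for i, r in enumerate(ROMAN)}


def romanised_to_thaana(text):
    """Replace every mapped character; spaces, digits and any other
    unmapped characters are passed through unchanged."""
    return text.translate(TO_THAANA)


# Example use with the sentence from this page.
thaana = romanised_to_thaana("waharenqnakI masqveriwewq")
```

Since the mapping is one-to-one, inverting the dictionary gives the reverse conversion (Thaana back to the romanised form).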

Thaana is written from right to left. The romanised form, however, is written from left to right. So:

"I am a fisherman"

outputs:

"waharenqnakI masqveriwewq" (read from left to right)

which is

"އަހަރެންނަކީ މަސްވެރިއެއް" (read from _right_ to left)
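
The character order is the same in both forms: Unicode text is stored in reading order, and the right-to-left layout is applied only when the text is displayed, so the conversion does not need to reverse anything. A small check with Python's standard `unicodedata` module illustrates this; the `describe` helper is a hypothetical name used only here.

```python
import unicodedata

# "waharenqnakI" converted character by character, in the same order
# as the romanised string (w, a, h, a, r, e, n, q, n, a, k, I).
word = "\u0787\u07a6\u0780\u07a6\u0783\u07ac\u0782\u07b0\u0782\u07a6\u0786\u07a9"


def describe(text):
    """Return (code point, Unicode name, bidi class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]


info = describe(word)
# info[0] is the alifu ("w"): first in the string, drawn rightmost on screen.
# Letters carry a right-to-left bidi class; fili are non-spacing marks that
# attach to the preceding letter.
```

Printing `info` lists the letters and fili in the order they were typed in the romanised form, even though a renderer draws the word starting from the right.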