Bengali and English/Anubadok
Revision as of 04:38, 15 April 2009
Anubadok is an open-source English-to-Bengali machine translation (MT) system developed by G M Hossain. It is currently at an experimental stage.
The program is licensed under the GPL. It is accessible from here.
Inflection Rules (BnSondhi.pm)
Legend
- C Consonant
- V Vowel
- _ Any Letter
- K Kar (Short form of Vowel)
- G General Rule - The consonants and vowels in the example are exchangeable with any other consonants and vowels
- S Special Rule - The consonants and vowels in the example are NOT exchangeable with any other consonants and vowels
- H Hasanth - The joiner, e.g. ঙ+্+গ = ঙ্গ, as in মঙ্গল
__CC means that a word ends in two consonants, and CV__ means that a word begins with a consonant followed by a vowel.
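The legend can be made concrete with a small sketch that maps each character of a word to its legend symbol. This is not taken from BnSondhi.pm: the function names `classify` and `pattern` are hypothetical, the code point ranges are read off the Unicode Bengali block, and `?` is used here for anything outside the four classes (for example ং).

```python
def classify(ch):
    """Map one Bengali character to a legend symbol (C, V, K, H) or '?'."""
    cp = ord(ch)
    if cp == 0x09CD:                      # hasanth, the joiner
        return "H"
    if 0x09BE <= cp <= 0x09CC:            # kars (dependent vowel signs)
        return "K"
    if 0x0985 <= cp <= 0x0994:            # independent vowels
        return "V"
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09CE, 0x09DC, 0x09DD, 0x09DF):
        return "C"                        # consonants, including ৎ ড় ঢ় য়
    return "?"                            # anything else (assumption)


def pattern(word):
    """Return the legend pattern of a whole word, one symbol per code point."""
    return "".join(classify(ch) for ch in word)


print(pattern("\u0996\u09cb\u09b2"))              # খোল   -> CKC
print(pattern("\u09ae\u0999\u09cd\u0997\u09b2"))  # মঙ্গল -> CCHCC
```

The sketch works on single code points, so letters stored in decomposed form (a base consonant followed by a nukta) would need normalizing first.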
Rules
- Verb Rules (Applies if the length of the first word is >= 3; rules are not exclusive, i.e. one word can qualify for multiple rules)
- G (C+ো+C)+(তে) = (C+ু+C+তে) e.g. খোল + তে = খুলতে
- G (া+C+া) + (তে) = (া+C+া+তে) e.g. পাঠা + তে = পাঠাতে
- G (C+ি)/(C+ে) + (ে+র)/(া+র) = (C+ে+ও+য়+া+র) e.g. নি + ের = নেওয়ার, নে + ার = নেওয়ার
- G (C+া) + (তে) = (C+ে+তে) e.g. পা + তে = পেতে
- G (C+ে) + (তে) = (C+ি+তে) e.g. দে + তে = দিতে
- G (C+C) + (ে+র) = (C+C+া+র) e.g. কর + ের = করার
- G (__K1) + (K2__) = (__K1__) [needs review]
- Preposition Rules
(Applies if the length of the first word is >= 2; rules are not exclusive, i.e. one word can qualify for multiple rules)
- G (ল+ে+C) -> (ল+ি+C) e.g. লেখ -> লিখ
- G (এ+ই) + (টি) = (এটি) e.g. এই + টি = এটি
- S (_+ক+ে) + (ে+র) = (_+র) e.g. আমাকে + ের = আমার
- G (C+C) + (ৈ+র) = (C+C+া+র) e.g. কর + এর = করার
- G (C/V/H/K + C) + (তে) = (C/V/H/K + C + ে) [needs review]
- S (ং) + (ে+র) = (ং+এ+র) e.g. সং + ের = সংএর
- G (__K1) + (K2__) = (__K1__) [needs review]
- Main Rules
(Applies if the length of the first word is >= 2; rules are not exclusive, i.e. one word can qualify for multiple rules)
- G (K+C) + (তে) = (K+C+ে) e.g. মার + তে = মারে [needs review]
- G (__K1) + (K2__) = (__K1__) [needs review]
- Sondhi for progressive tag POS
- G (K) + (_) = (K)
- Basic Verb Shondhi
- G (_+ি) + (_) = (_+ে+ও+য়+_)
- Verb Shondhi for passive sentence
- G (C/K + C) + (_) = (C/K+C+_) else
- G (_+ি) + (_) = (_+ে+ও+য়+_)
- Verb Shondhi for active sentence
(Applies if the length of the first word is >= 1; rules are not exclusive, i.e. one word can qualify for multiple rules)
- G (_ে) + (ো) = (_া+ও) e.g. নে + ো = নাও
- G (C+ে) + (_) = (C+ি+_) e.g. দে -> দি
- G (ল+ে+C) -> (ল+ি+C) e.g. লেখ -> লিখ [why does Anubadok repeat this rule?]
- 3rd person, present simple
- S (ল+ি+C) + (ে) = (ল+ে+C+ে) e.g. লিখ + ে-> লেখ + ে-> লেখে
- S (দি) + (ে) = দেয় e.g. দি + ে-> দে + ে-> দেয়
- G (C/K+C) + (ে) = (C/K+C+ে) else
- G (_) + (ে) = (_য়) e.g. হ + ে-> হয়, খা + ে-> খায়
- 2nd person, present simple
- S (ল+ি+C) + (েন) = (ল+ে+C+ে+ন) e.g. লিখ + েন-> লেখ + েন-> লেখেন
- G (C/V/ে+C) + (েন) = (C/V/ে+C+েন) e.g. বল + েন = বলেন, চাল + েন = চালেন
- G (imperative) লিখ + েন -> লিখ + ুন -> লিখুন
- G (C+K) + (েন) = (C+K+ন) e.g. বাড়া + েন -> বাড়ান
- 1st/2nd/3rd person, future simple
- G (_+ি) + (ব) = (_+ে+ব) e.g. দি + ব = দেব [This would also pass through other rules for other persons, e.g. দেব + েন = দেবেন, from present tense]
- 1st/2nd/3rd person, past simple
- S (যা) + (ল/লাম/লেন) = (গা+ল/লাম/লেন)
- S (গা) + (ল/লাম/লেন) = (গে+ল/লাম/লেন)
- 1st person, present simple
- G (ল+ে+C) -> (ল+ি+C) e.g. লেখ -> লিখ
- S লিখ + ি = লিখি
- S (খা) + (ি) = খাই
- 1st/2nd/3rd person, present/past continuous
- (CC) + (ছি/ছে) = (CC+ছি/ছে) e.g. কর + ছি/ছে = করছি/করছে
- (_K) + (ছি/ছে) = (_K+চ+ছি/ছে) e.g. পড়া + ছি = পড়াচ্ছি
- 1st/2nd/3rd person, present/past, perfect
- যা + (ে+ছ+ে/ি)-> গি + (ে+ছ+ে/ি)
- (C+া+C) + (ে+ছ+ে/ি) -> (C+ে+C) + (ে+ছ+ে/ি)
- হ + ে + (ে+ছ+ে/ি)-> হয়ে + (ে+ছ+ে/ি)
- ঘটা + ে + (ে+ছ+ে/ি)-> ঘটা+ি+য়+ে + (ে+ছ+ে/ি)
- খা+ ে + (ে+ছ+ে/ি)-> খেয়ে + (ে+ছ+ে/ি)
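A few of the rules above can be sketched in Python to show how a pattern-plus-suffix rule turns into code. This is an illustration only, not the Perl code of BnSondhi.pm: the function names `join_verb` and `third_person_present` are hypothetical, patterns are assumed to match the whole stem, the length thresholds are not enforced, and only the basic consonant range ক to হ is covered.

```python
import re

# Character classes, limited to the basic Bengali Unicode ranges (an assumption).
C = "[\u0995-\u09b9]"   # consonants ক..হ
K = "[\u09be-\u09cc]"   # kars (dependent vowel signs)

AA, I, U, E, O = "\u09be", "\u09bf", "\u09c1", "\u09c7", "\u09cb"  # া ি ু ে ো
TE = "\u09a4" + E        # তে
ER = E + "\u09b0"        # ের
AR = AA + "\u09b0"       # ার
YA = "\u09df"            # য়


def join_verb(stem, suffix):
    """Join a verb stem and a suffix using four of the listed verb rules."""
    if suffix == TE:
        m = re.fullmatch(f"({C}){O}({C})", stem)
        if m:                                   # খোল + তে = খুলতে
            return m.group(1) + U + m.group(2) + TE
        m = re.fullmatch(f"({C}){AA}", stem)
        if m:                                   # পা + তে = পেতে
            return m.group(1) + E + TE
        m = re.fullmatch(f"({C}){E}", stem)
        if m:                                   # দে + তে = দিতে
            return m.group(1) + I + TE
    if suffix == ER and re.fullmatch(f"{C}{C}", stem):
        return stem + AR                        # কর + ের = করার
    return stem + suffix                        # e.g. পাঠা + তে = পাঠাতে


def third_person_present(stem):
    """3rd person, present simple: stem + ে, following the listed rules."""
    special = {
        "\u09b2\u09bf\u0996": "\u09b2\u09c7\u0996" + E,  # লিখ -> লেখে
        "\u09a6\u09bf": "\u09a6" + E + YA,               # দি -> দেয়
    }
    if stem in special:
        return special[stem]
    if re.search(f"(?:{C}|{K}){C}$", stem):     # কর -> করে
        return stem + E
    return stem + YA                            # হ -> হয়, খা -> খায়


print(join_verb("\u0996\u09cb\u09b2", TE))      # খুলতে
print(third_person_present("\u0996\u09be"))     # খায়
```

Special (S) rules are modelled as a lookup table and general (G) rules as regular expressions, which mirrors the distinction made in the legend.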
Tag Set
Anubadok uses the Penn Treebank tag set, which is as follows:
Tag | Gloss | Example |
---|---|---|
CC | Coordinating conjunction | |
CD | Cardinal number | |
DT | Determiner | |
EX | Existential there | |
FW | Foreign word | |
IN | Preposition or subordinating conjunction | |
JJ | Adjective | |
JJR | Adjective, comparative | |
JJS | Adjective, superlative | |
LS | List item marker | |
MD | Modal | |
NN | Noun, singular or mass | |
NNS | Noun, plural | |
NP | Proper noun, singular | |
NPS | Proper noun, plural | |
PDT | Predeterminer | |
POS | Possessive ending | |
PP | Personal pronoun | |
PP$ | Possessive pronoun | |
RB | Adverb | |
RBR | Adverb, comparative | |
RBS | Adverb, superlative | |
RP | Particle | |
SYM | Symbol | |
TO | to | |
UH | Interjection | |
VB | Verb, base form | |
VBD | Verb, past tense | |
VBG | Verb, gerund or present participle | |
VBN | Verb, past participle | |
VBP | Verb, non-3rd person singular present | |
VBZ | Verb, 3rd person singular present | |
WDT | Wh-determiner | |
WP | Wh-pronoun | |
WP$ | Possessive wh-pronoun | |
WRB | Wh-adverb | |
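As a rough illustration of how the tags in this table group together, the sketch below collapses a tag into a coarse word class. The function `coarse_category` and its category names are hypothetical and are not part of Anubadok; the grouping follows only the glosses in the table above.

```python
def coarse_category(tag):
    """Collapse a tag from the table above into a coarse word class."""
    if tag.startswith("VB"):                      # VB, VBD, VBG, VBN, VBP, VBZ
        return "verb"
    if tag in ("NN", "NNS", "NP", "NPS"):
        return "noun"
    if tag.startswith("JJ"):                      # JJ, JJR, JJS
        return "adjective"
    if tag.startswith("RB"):                      # RB, RBR, RBS
        return "adverb"
    if tag in ("PP", "PP$", "WP", "WP$"):
        return "pronoun"
    return "other"


for tag in ("VBZ", "NPS", "JJR", "PP$", "IN"):
    print(tag, coarse_category(tag))
```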
Inflection Table
Verb: কর (do)
Verbal noun (present participle/gerund) | া | N/A | N/A | N/A | N/A | N/A | N/A |
Infinitive (to do) | তে/ ার জন্য | N/A | N/A | N/A | N/A | N/A | N/A |
Present simple | N/A | ি | েন | - | - | েন | ে |
Present continuous | N/A | ছি | ছেন | ছ | ছিস | ছেন | ছে |
Future simple | N/A | ব | বেন | বে | বি | বেন | বে |
Simple past | N/A | লাম | লেন | লে | লি | লেন | ল |
Habitual past (Imperfect) | N/A | তাম | তেন | তে | তি | তেন | ত |
Conditional past (I would do) | N/A | ব | বেন | বে | বি | বেন | বে |
Continuous past | N/A | ছিলাম | ছিলেন | ছিলে | ছিলি | ছিলেন | ছিল |
perfect | N/A | েছি | েছেন | েছ | েছিস | েছেন | েছে |
pluperfect(past perfect) | N/A | েছিলাম | েছিলেন | েছিলে | েছিলি | েছিলেন | েছিল |
Past participle | ে | N/A | N/A | N/A | N/A | N/A | N/A |
Conditional participle | লে | N/A | N/A | N/A | N/A | N/A | N/A |
Present imperative | - | N/A | N/A | N/A | N/A | N/A | N/A |
Future imperative | িস | N/A | N/A | N/A | N/A | N/A | N/A |
Future Continuous | N/A | তে থাকব | তে থাকবেন | তে থাকবে | তে থাকবি | তে থাকবেন | তে থাকবে |
(I Might have done) | N/A | ে থাকব | ে থাকবেন | ে থাকবে | ে থাকবি | ে থাকবেন | ে থাকবে |
Verb: কর (do) - Causative
Verbal noun (present participle/gerund) | ানো | N/A | N/A | N/A | N/A | N/A | N/A |
Infinitive (to do) | াতে/ ানোর জন্য | N/A | N/A | N/A | N/A | N/A | N/A |
Present simple | N/A | াই | ান | াও | াস | ান | ায় |
Present continuous | N/A | াচ্ছি | াচ্ছেন | াচ্ছ | াচ্ছিস | াচ্ছেন | াচ্ছে |
Future simple | N/A | াব | াবেন | াবে | াবি | াবেন | াবে |
Simple past | N/A | ালাম | ালেন | ালে | ালি | ালেন | াল |
Habitual past (Imperfect) | N/A | াতাম | াতেন | াতে | াতি | াতেন | াত |
Conditional past (I would do) | N/A | াব | াবেন | াবে | াবি | াবেন | াবে |
Continuous past | N/A | াচ্ছিলাম | াচ্ছিলেন | াচ্ছিলে | াচ্ছিলি | াচ্ছিলেন | াচ্ছিল |
perfect | N/A | িয়েছি | িয়েছেন | িয়েছ | িয়েছিস | িয়েছেন | িয়েছে |
pluperfect(past perfect) | N/A | িয়েছিলাম | িয়েছিলেন | িয়েছিলে | িয়েছিলি | িয়েছিলেন | িয়েছিল |
Past participle | িয়ে | N/A | N/A | N/A | N/A | N/A | N/A |
Conditional participle | ালে | N/A | N/A | N/A | N/A | N/A | N/A |
Present imperative | াও | N/A | N/A | N/A | N/A | N/A | N/A |
Future imperative | াস | N/A | N/A | N/A | N/A | N/A | N/A |
Future Continuous | N/A | াতে থাকব | াতে থাকবেন | াতে থাকবে | াতে থাকবি | াতে থাকবেন | াতে থাকবে |
(I Might have done) | N/A | াতে থাকব | াতে থাকবেন | াতে থাকবে | াতে থাকবি | াতে থাকবেন | াতে থাকবে |
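Read as data, the two tables amount to a lookup from tense and column to an ending that is appended to the stem. The sketch below copies three rows from each table and is only an illustration: the names `ENDINGS`, `CAUSATIVE_ENDINGS` and `conjugate` are hypothetical, and because the tables do not label their person columns, forms are selected by column position (0 to 5, counted after the N/A cell).

```python
# Endings copied from the plain table (Verb: কর), one list per tense row.
ENDINGS = {
    "present continuous": ["ছি", "ছেন", "ছ", "ছিস", "ছেন", "ছে"],
    "future simple": ["ব", "বেন", "বে", "বি", "বেন", "বে"],
    "simple past": ["লাম", "লেন", "লে", "লি", "লেন", "ল"],
}

# Endings copied from the causative table (same rows, same column order).
CAUSATIVE_ENDINGS = {
    "present continuous": ["াচ্ছি", "াচ্ছেন", "াচ্ছ", "াচ্ছিস", "াচ্ছেন", "াচ্ছে"],
    "future simple": ["াব", "াবেন", "াবে", "াবি", "াবেন", "াবে"],
    "simple past": ["ালাম", "ালেন", "ালে", "ালি", "ালেন", "াল"],
}


def conjugate(stem, tense, column, causative=False):
    """Append the ending found at (tense, column) to the verb stem."""
    table = CAUSATIVE_ENDINGS if causative else ENDINGS
    return stem + table[tense][column]


stem = "\u0995\u09b0"  # কর
print(conjugate(stem, "present continuous", 0))                  # করছি
print(conjugate(stem, "present continuous", 0, causative=True))  # করাচ্ছি
```

Plain concatenation is enough for কর; stems that change shape would first have to pass through the sondhi rules listed earlier.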
Trivia
In Bengali:
- Verbs do not inflect for number. So "I go" is "আমি যাই" and "We go" is "আমরা যাই".
- Verbs do not inflect for gender. So "He goes" is "সে যায়" and "She goes" is also "সে যায়".
- Nouns and adjectives have gender. So "He is an old man" is "সে একজন বৃদ্ধ লোক" and "She is an old woman" is "সে একজন বৃদ্ধা মহিলা".
- Pronouns do not have gender. "He" is "সে" and "She" is also "সে".