Apertium-eng-srn
Latest revision as of 21:59, 3 September 2017
Sranan Tongo (lit. "Surinamese tongue") is an English-based creole language spoken as a lingua franca by approximately 500,000 people in Suriname. Shared among the Dutch-, Indigenous-, Javanese-, Hindustani-, and Chinese-speaking communities, Sranan Tongo generally serves as a second language, although around 125,000 Surinamese speak it as a first language. Sranan Tongo's lexicon is a fusion of English, Dutch, Portuguese, and Central and West African languages, with extensive use of Dutch loanwords. Although it has no direct relatives, Sranan Tongo displays numerous similarities with Krio, the lingua franca of Sierra Leone (to the point where the two languages are somewhat mutually intelligible), and, to a lesser extent, with Atlantic English pidgins.
This language pair is the first of its kind available online. Although much progress has been made on the lexicon and grammar, more work is needed on the grammar in the Sranan-to-English direction.
One major challenge in Sranan-English translation is that many verbs may also be used as their related nouns or adjectives (and vice versa). To compound this problem, determiners (found before nouns) take very similar forms to personal pronouns (found before verbs), making it difficult to disambiguate words. For example, 'mi singi' may be translated either as 'my song' or 'I sing'; 'den aksi' may mean either 'they ask' or 'the questions.'
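The ambiguity can be made concrete with a small sketch. The toy lexicon below is an assumption for illustration only: the tag names `det`, `prn`, `n`, and `vblex` follow common Apertium conventions, but the entries are not taken from the actual apertium-srn dictionary. The function lists every part-of-speech combination for a two-word phrase and keeps those in which a determiner precedes a noun or a pronoun precedes a verb.

```python
from itertools import product

# Toy lexicon (hypothetical): surface form -> list of (tag, English gloss).
# 'aksi' is glossed in the plural because 'den' is read as a plural determiner.
LEXICON = {
    "mi": [("det", "my"), ("prn", "I")],
    "den": [("det", "the"), ("prn", "they")],
    "singi": [("n", "song"), ("vblex", "sing")],
    "aksi": [("n", "questions"), ("vblex", "ask")],
}

# Tag sequences treated as well-formed two-word phrases in this sketch.
VALID_PAIRS = {("det", "n"), ("prn", "vblex")}


def readings(phrase):
    """Return every gloss of a two-word phrase whose tag sequence is valid."""
    first, second = phrase.split()
    result = []
    for (tag1, gloss1), (tag2, gloss2) in product(LEXICON[first], LEXICON[second]):
        if (tag1, tag2) in VALID_PAIRS:
            result.append(f"{gloss1} {gloss2}")
    return result


print(readings("mi singi"))  # ['my song', 'I sing']
print(readings("den aksi"))  # ['the questions', 'they ask']
```

Both phrases keep two readings after this local check, which is why wider context is needed to disambiguate them.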
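Apertium-srn is available on SVN (https://svn.code.sf.net/p/apertium/svn/incubator/apertium-srn/). Apertium-eng-srn is available on SVN (https://svn.code.sf.net/p/apertium/svn/incubator/apertium-eng-srn/).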
Vocabulary Sources | Rutu fu den Wortu
Vocabulary is taken from SIL's Wortubuku fu Sranan Tongo.
Statistics | Den Statistiek
| Sranan stems | 3,083 | 
| Sranan rlx rules | 62 | 
| Sranan paradigms | 30 | 
| English-Sranan stems | 5,145 | 
| Sranan-English t1x rules | 22 | 
| Sranan-English t2x rules | 3 | 
Sample Translations
- (srn) Dan fa den ben wakawaka ini a oso a ijskasi bigin degedege. → Then because they wandered inside the house the refrigerator begins wobbling.
- (srn) Dan mi yere: BAM! A koba fadon tapu a patu nanga okro. → Then I hear: *BAM! The bowl falls on the pot with okra.
- (srn) Dan a patu kanti fadon tapu mi bakasei. → Then the pot edge falls on my buttocks.
- (srn) Mi dyompo opo. Mi bari! Mi bari! → I jump up. I shout! I shout!
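Sentences like these can be checked against a locally compiled pair by piping text through the `apertium` command-line tool. The sketch below wraps that call in Python. The mode name `srn-eng` and the use of `-d` to point at a local checkout are assumptions about how the pair is installed, and `build_command` and `translate` are hypothetical helper names.

```python
import subprocess


def build_command(mode="srn-eng", data_dir=None):
    """Build the argument list for the apertium CLI (the mode name is an assumption)."""
    command = ["apertium"]
    if data_dir is not None:
        # -d points apertium at a directory of linguistic data, e.g. a local checkout.
        command += ["-d", data_dir]
    command.append(mode)
    return command


def translate(text, mode="srn-eng", data_dir=None):
    """Pipe text through apertium and return its output; requires apertium to be installed."""
    completed = subprocess.run(
        build_command(mode, data_dir),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()
```

With the pair installed, `translate("Dan a patu kanti fadon tapu mi bakasei.")` would be expected to return the English sentence listed above. A leading `*` in the output marks a word the analyser did not recognise.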
Progress | Sani na du
- Converted the Sranan Tongo dictionary (PDF) into a machine-readable format
- Cleaned entries
- Paradigms created for most Sranan lemmas
- Advanced chunking implemented
- Different combinations of modal verbs, tense particles, and multiple verbs implemented in chunking
- Disambiguation constraint grammar entries added
To be done | San musu du
- English -> Sranan transfer rules (not yet tested)
- Proper handling of adjectives with <sint>
- Handling of causative structures (e.g. *mi mama taigi mi brada meki a tyari mi go na datra.*)
- Anthroponyms
- Proper handling of de as the existential particle when word-final (e.g. datra de -> "the doctor is there," not "the doctor is")
- Proper handling of multiwords
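
The word-final 'de' item in the list above can be illustrated with a toy rule. This is a sketch in Python rather than an Apertium transfer rule, and the function name `gloss_de` is hypothetical: it picks "is there" when 'de' is the last word of the sentence and "is" otherwise. The non-final example sentence is constructed from words that appear in the sample translations.

```python
def gloss_de(tokens, index):
    """Return the English gloss for 'de' at tokens[index] (toy rule, not the pair's actual logic)."""
    if tokens[index] != "de":
        raise ValueError("token at index is not 'de'")
    # Ignore trailing punctuation so that 'datra de.' still counts as word-final.
    following = [t for t in tokens[index + 1:] if t not in {".", "!", "?"}]
    return "is there" if not following else "is"


print(gloss_de(["datra", "de"], 1))                     # is there
print(gloss_de(["datra", "de", "ini", "a", "oso"], 1))  # is
```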

