<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=User%3AEden%2FGSOC2020_Swahili-Lingala</id>
	<title>User:Eden/GSOC2020 Swahili-Lingala - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=User%3AEden%2FGSOC2020_Swahili-Lingala"/>
	<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSOC2020_Swahili-Lingala&amp;action=history"/>
	<updated>2026-04-09T14:00:55Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.1</generator>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSOC2020_Swahili-Lingala&amp;diff=72236&amp;oldid=prev</id>
		<title>Eden: Created page with &quot;== Goal == Create a usable ‘Swahili-Lingala’ language pair. &lt;br/&gt;  == Swahili and Lingala resources == Here is a list of &#039;&#039;open&#039;&#039; and &#039;&#039;public domain&#039;&#039; resources(dictionar...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSOC2020_Swahili-Lingala&amp;diff=72236&amp;oldid=prev"/>
		<updated>2020-05-19T05:03:52Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;== Goal == Create a usable ‘Swahili-Lingala’ language pair. &amp;lt;br/&amp;gt;  == Swahili and Lingala resources == Here is a list of &amp;#039;&amp;#039;open&amp;#039;&amp;#039; and &amp;#039;&amp;#039;public domain&amp;#039;&amp;#039; resources(dictionar...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;== Goal ==&lt;br /&gt;
Create a usable ‘Swahili-Lingala’ language pair. &amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Swahili and Lingala resources ==&lt;br /&gt;
Here is a list of &amp;#039;&amp;#039;open&amp;#039;&amp;#039; and &amp;#039;&amp;#039;public domain&amp;#039;&amp;#039; resources (dictionaries, grammar books, texts, etc.) for Swahili:&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*Corpus/frequency list/bigram&lt;br /&gt;
- [https://github.com/thefreezer/apertium-swa/blob/master/dev/wikipedia_corpus.txt ~7M-word corpus] (needs a little more work) &amp;lt;br/&amp;gt;&lt;br /&gt;
- [http://crubadan.org/writingsystems An Crúbadán] &amp;lt;br /&amp;gt;&lt;br /&gt;
*Dictionary&lt;br /&gt;
- [https://kamusi.org/swahili-english-wordlist-2008 Swa-Eng] and [https://kamusi.org/english-swahili-wordlist-2008 Eng-Swa]&amp;lt;br/&amp;gt;&lt;br /&gt;
- [https://archive.org/details/swahilienglishdi00mada/page/n15/mode/2up Madan, A. C., 1846], [https://archive.org/details/englishswahilid00madagoog/page/n13/mode/2up Madan, A. C., 1902], [https://archive.org/details/Swahili-englishDictionary/mode/2up Charles, W. R.]&amp;lt;br/&amp;gt;&lt;br /&gt;
- [https://github.com/freedict/fd-dictionaries/tree/master/swh-eng Freedict] &amp;lt;br/&amp;gt;&lt;br /&gt;
*Grammar rules&lt;br /&gt;
- [https://en.wikipedia.org/wiki/Swahili_grammar Wikipedia&amp;#039;s Grammar Rules]&amp;lt;br/&amp;gt;&lt;br /&gt;
- [https://archive.org/details/swahiligrammarvo00burtiala/page/96/mode/2up Burt, A. E., 1910] &amp;lt;br/&amp;gt;&lt;br /&gt;
- [https://archive.org/details/ERIC_ED012888/page/n123/mode/2up Follome]&amp;lt;br/&amp;gt;&lt;br /&gt;
- [https://archive.org/details/ERIC_ED012888/page/n123/mode/2up Steere, Edward]&amp;lt;br/&amp;gt;&lt;br /&gt;
*Other&lt;br /&gt;
- [http://www.language-archives.org/language/swh Language Archive] &amp;lt;br/&amp;gt;&lt;br /&gt;
- [https://wals.info/languoid/lect/wals_code_swa WALS] &amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Work plan ==&lt;br /&gt;
Community bonding period (May 4-June 1)&lt;br /&gt;
 - See [[User:Eden/GSoC_progress]]&lt;br /&gt;
&lt;br /&gt;
 Week 1 (June 1-7):&lt;br /&gt;
 - Add nouns (from frequency list) to the lin transducer&lt;br /&gt;
 - Add nouns (from frequency list) to the swa transducer&lt;br /&gt;
 - Work on vowels&lt;br /&gt;
 - Constraint grammar for nouns&lt;br /&gt;
 - Add verbs&lt;br /&gt;
&lt;br /&gt;
 Week 2 (June 8-14):&lt;br /&gt;
 - Add pronouns and adjectives to the swa transducer&lt;br /&gt;
 - Continue work on verbs&lt;br /&gt;
 - Reference: kaz and lin transducers&lt;br /&gt;
 - Add prepositions and pronouns, conjunctions&lt;br /&gt;
 - Work on numerals&lt;br /&gt;
 - CG for all the above&lt;br /&gt;
&lt;br /&gt;
 Week 3 (June 15-21):&lt;br /&gt;
 - Regression testing&lt;br /&gt;
 - Test and polish the transducer (work on bigrams)&lt;br /&gt;
 - Finish adding adverbs, conjunctions, prepositions, etc&lt;br /&gt;
 - Start work on bilingual dictionary&lt;br /&gt;
&lt;br /&gt;
 Week 4 (June 22-28):&lt;br /&gt;
 - Add nouns and adjectives to the bidix&lt;br /&gt;
 - Transfer rules for nouns and adjectives (both directions)&lt;br /&gt;
 - Disambiguation rules&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Deliverable #1 (June 29): Advanced Swahili transducer (&amp;gt;10k entries) with a basic bilingual dictionary&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
 Week 5 (June 29-July 5):&lt;br /&gt;
 - Continue work on bidix: add nouns and verbs &lt;br /&gt;
 - Focus on verbs&lt;br /&gt;
 - Reference: transfer rules from eng-lin, kaz-eng, and eng-fre&lt;br /&gt;
 - Transfer rules for verbs in both directions&lt;br /&gt;
&lt;br /&gt;
 Week 6 (July 6-12):&lt;br /&gt;
 - Add pronouns and transfer rules for them&lt;br /&gt;
 - Add adverbs&lt;br /&gt;
 - Work on compound Swahili words&lt;br /&gt;
 - Transfer rules for pronouns, adverbs and compound nouns (both directions)&lt;br /&gt;
&lt;br /&gt;
 Week 7 (July 13-19):&lt;br /&gt;
 - Goal: well-defined macros for verbs and pronouns&lt;br /&gt;
 - WER &amp;lt; 35% on a 500-word story&lt;br /&gt;
 - Add/polish rules for concordance between verbs and pronouns&lt;br /&gt;
&lt;br /&gt;
 Week 8 (July 20-26):&lt;br /&gt;
 - Continue work on transfer rules&lt;br /&gt;
 - Work on disambiguation rules&lt;br /&gt;
 - Lots of testing and improvements&lt;br /&gt;
 - WER &amp;lt; 30% in both directions on a 1,000-word story&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Deliverable #2 (July 3): Advanced bilingual dictionary (~15,000 words) and transfer rules&amp;#039;&amp;#039;&amp;#039; ...&lt;br /&gt;
&lt;br /&gt;
 Week 9 (July 27-August 2):&lt;br /&gt;
 - Continue work on disambiguation (both directions)&lt;br /&gt;
 - Testvoc and improvements&lt;br /&gt;
 - Filling bidix&lt;br /&gt;
&lt;br /&gt;
 Week 10 (August 3-9):&lt;br /&gt;
 - Work on transfer rules&lt;br /&gt;
 - Goal: WER ~30% on a story of more than 1,000 words&lt;br /&gt;
&lt;br /&gt;
 Week 11 (August 10-16):&lt;br /&gt;
 - Continue work on transfer rules and testing&lt;br /&gt;
 - Wikipedia article translations&lt;br /&gt;
 - Continue filling bidix&lt;br /&gt;
&lt;br /&gt;
 Week 12 (August 17-23):&lt;br /&gt;
 - Continue filling bidix with miscellaneous words&lt;br /&gt;
 - Detailed analysis of work completed (wiki)&lt;br /&gt;
 - (If the work goes well, start working on new pairs)&lt;br /&gt;
 - Evaluation of results and documentation&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Submit code and final evaluations (August 24-31): WER &amp;lt; 30% (with ~20,000 words in bidix) in both directions on most texts&amp;#039;&amp;#039;&amp;#039;&lt;/div&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
</feed>