<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=User%3AEden%2FGSoC2019Report</id>
	<title>User:Eden/GSoC2019Report - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=User%3AEden%2FGSoC2019Report"/>
	<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;action=history"/>
	<updated>2026-04-09T10:58:45Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.1</generator>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70404&amp;oldid=prev</id>
		<title>Eden: /* Morphological Analyzer */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70404&amp;oldid=prev"/>
		<updated>2019-08-26T15:50:16Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Morphological Analyzer&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:50, 26 August 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 5:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 5:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;==== Morphological Analyzer ====&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;==== Morphological Analyzer ====&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;Code is &#039;&#039;&#039;[https://github.com/apertium/apertium-lin here]&#039;&#039;&#039;.(I directly committed everything into the repo)&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;Code is &#039;&#039;&#039;[https://github.com/apertium/apertium-lin here]&#039;&#039;&#039;.(I directly committed everything into the repo)&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;Before GSoC 2019, the Lingala transducer was already fairly well developed. It could accurately recognize and classify most parts of speech. My work mainly consisted of adding more vocabulary and missing morphology. The transducer had ~700 stems before GSoC and as of now, it contains ~1,500 stems. The original goal was to have ~7,000 stems but due to a lack of digitized resources, I could only get so far. The Wikipedia dump was great because it provided me with a lot of vocabulary and it was also useful for diacritic restoration, but unfortunately it also contained a lot of French, Portuguese, and English words.&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;Before GSoC 2019, the Lingala transducer was already fairly well developed. It could accurately recognize and classify most parts of speech. My work mainly consisted of adding more vocabulary and missing morphology. The transducer had ~700 stems before GSoC and as of now, it contains ~1,500 stems. The original goal was to have ~7,000 stems but due to a lack of digitized resources, I could only get so far. The&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; &#039;&#039;&#039;[https://dumps.wikimedia.org/lnwiki/20190820/&lt;/ins&gt; Wikipedia dump&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]&#039;&#039;&#039;&lt;/ins&gt; was great because it provided me with a lot of vocabulary and it was also useful for diacritic restoration, but unfortunately it also contained a lot of French, Portuguese, and English words.&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;I also added missing morphology for adjectives and pronouns to handle the &#039;old&#039; Lingala orthography. This increased coverage by about 4% at the time.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;I also added missing morphology for adjectives and pronouns to handle the &#039;old&#039; Lingala orthography. This increased coverage by about 4% at the time.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;My mentor, Jonorthwash, also added more spell relax rules (thanks again, btw).&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;My mentor, Jonorthwash, also added more spell relax rules (thanks again, btw).&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70403&amp;oldid=prev</id>
		<title>Eden: /* Bilingual dictionary */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70403&amp;oldid=prev"/>
		<updated>2019-08-26T15:47:59Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Bilingual dictionary&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:47, 26 August 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;Current state of the bilingual dictionary:&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;Current state of the bilingual dictionary:&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* Stems: &#039;&#039;&#039;1,802&#039;&#039;&#039; ([https://github.com/apertium/apertium-eng-lin/blob/master/apertium-eng-lin.eng-lin.dix apertium-eng-lin.dix])&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* Stems: &#039;&#039;&#039;1,802&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Wikipedia naïve coverage: &#039;&#039;&#039;72.61%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Wikipedia naïve coverage: &#039;&#039;&#039;72.61%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Bible naïve coverage: &#039;&#039;&#039;90.50%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Bible naïve coverage: &#039;&#039;&#039;90.50%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* WER of [https://github.com/apertium/apertium-eng-lin/blob/master/dev/story_eng.txt story](Lin-Eng): &#039;&#039;&#039;47.15%&#039;&#039;&#039;[https://github.com/apertium/apertium-eng-lin/blob/master/dev/eng.o.txt Final eng output]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* WER of [https://github.com/apertium/apertium-eng-lin/blob/master/dev/story_eng.txt story](Lin-Eng): &#039;&#039;&#039;47.15%&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; (&lt;/ins&gt;[https://github.com/apertium/apertium-eng-lin/blob/master/dev/eng.o.txt Final eng output]&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* WER of &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;same&lt;/del&gt; story&lt;del class=&quot;diffchange diffchange-inline&quot;&gt; as the above&lt;/del&gt;(Eng-Lin): &#039;&#039;&#039;50.93%&#039;&#039;&#039;[https://github.com/apertium/apertium-eng-lin/blob/master/dev/lin.o.txt Final lin output]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* WER of &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[https://github.com/apertium/apertium-eng-lin/blob/master/dev/story_lin.txt&lt;/ins&gt; story&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;]&lt;/ins&gt;(Eng-Lin): &#039;&#039;&#039;50.93%&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; (&lt;/ins&gt;[https://github.com/apertium/apertium-eng-lin/blob/master/dev/lin.o.txt Final lin output]&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* Lexical selection rules: &#039;&#039;&#039;~30&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* Lexical selection rules: &#039;&#039;&#039;~30&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; ([https://github.com/apertium/apertium-eng-lin/blob/master/apertium-eng-lin.lin-eng.lrx apertium-eng-lin.lrx])&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Testvoc(Wikipedia corpus lin-eng):&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Testvoc(Wikipedia corpus lin-eng):&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;** Number of tokenised words in the corpus: &#039;&#039;&#039;589,666&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;** Number of tokenised words in the corpus: &#039;&#039;&#039;589,666&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70402&amp;oldid=prev</id>
		<title>Eden: /* Bilingual dictionary */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70402&amp;oldid=prev"/>
		<updated>2019-08-26T15:43:00Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Bilingual dictionary&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:43, 26 August 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 26:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 26:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Wikipedia naïve coverage: &#039;&#039;&#039;72.61%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Wikipedia naïve coverage: &#039;&#039;&#039;72.61%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Bible naïve coverage: &#039;&#039;&#039;90.50%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Bible naïve coverage: &#039;&#039;&#039;90.50%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* WER of [https://github.com/apertium/apertium-eng-lin/blob/master/dev/story_eng.txt story](Lin-Eng): &#039;&#039;&#039;47.15%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* WER of [https://github.com/apertium/apertium-eng-lin/blob/master/dev/story_eng.txt story](Lin-Eng): &#039;&#039;&#039;47.15%&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[https://github.com/apertium/apertium-eng-lin/blob/master/dev/eng.o.txt Final eng output]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* WER of same story as the above(Eng-Lin): &#039;&#039;&#039;50.93%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* WER of same story as the above(Eng-Lin): &#039;&#039;&#039;50.93%&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;[https://github.com/apertium/apertium-eng-lin/blob/master/dev/lin.o.txt Final lin output]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Lexical selection rules: &#039;&#039;&#039;~30&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Lexical selection rules: &#039;&#039;&#039;~30&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Testvoc(Wikipedia corpus lin-eng):&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Testvoc(Wikipedia corpus lin-eng):&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70401&amp;oldid=prev</id>
		<title>Eden: /* Bilingual dictionary */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70401&amp;oldid=prev"/>
		<updated>2019-08-26T15:40:35Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Bilingual dictionary&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:40, 26 August 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 16:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 16:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;==== Bilingual dictionary ====&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;==== Bilingual dictionary ====&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;Code is &#039;&#039;&#039;[https://github.com/apertium/apertium-eng-lin here]&#039;&#039;&#039;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;Code is &#039;&#039;&#039;[https://github.com/apertium/apertium-eng-lin here]&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;(I directly committed everything into the repo.)&lt;/ins&gt;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;The bilingual dictionary was written from scratch. Vocabulary mainly came from dictionaries and personal knowledge. The bilingual dictionary ([https://github.com/apertium/apertium-eng-lin/blob/master/apertium-eng-lin.eng-lin.dix apertium-eng-lin]) contains a lot of one-to-many entries because of the ambiguous nature of Lingala. I also wrote some transfer rules for both directions. A lot of rules and macros were recycled from more mature pairs (eng-fra, eng-cat), which makes the code cleaner and makes it easier to add rules later on. Transfer rules were limited to first-level (.t1x) rules because other levels weren&#039;t yet necessary. Given the ambiguity of Lingala, I found lexical selection rules to be very effective in resolving some of it. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;The bilingual dictionary was written from scratch. Vocabulary mainly came from dictionaries and personal knowledge. The bilingual dictionary ([https://github.com/apertium/apertium-eng-lin/blob/master/apertium-eng-lin.eng-lin.dix apertium-eng-lin]) contains a lot of one-to-many entries because of the ambiguous nature of Lingala. I also wrote some transfer rules for both directions. A lot of rules and macros were recycled from more mature pairs (eng-fra, eng-cat), which makes the code cleaner and makes it easier to add rules later on. Transfer rules were limited to first-level (.t1x) rules because other levels weren&#039;t yet necessary. Given the ambiguity of Lingala, I found lexical selection rules to be very effective in resolving some of it. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70400&amp;oldid=prev</id>
		<title>Eden: /* Morphological Analyzer */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70400&amp;oldid=prev"/>
		<updated>2019-08-26T15:39:17Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Morphological Analyzer&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:39, 26 August 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 4:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 4:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;==Evaluation of Work Done==&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;==Evaluation of Work Done==&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;==== Morphological Analyzer ====&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;==== Morphological Analyzer ====&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;Code is &#039;&#039;&#039;[https://github.com/apertium/apertium-lin here]&#039;&#039;&#039;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;Code is &#039;&#039;&#039;[https://github.com/apertium/apertium-lin here]&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;.(I directly committed everything into the repo)&lt;/ins&gt;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;Before GSoC 2019, the Lingala transducer was already fairly well developed. It could accurately recognize and classify most parts of speech. My work mainly consisted of adding more vocabulary and missing morphology. The transducer had ~700 stems before GSoC and as of now, it contains ~1,500 stems. The original goal was to have ~7,000 stems but due to a lack of digitized resources, I could only get so far. The Wikipedia dump was great because it provided me with a lot of vocabulary and it was also useful for diacritic restoration, but unfortunately it also contained a lot of French, Portuguese, and English words.&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;Before GSoC 2019, the Lingala transducer was already fairly well developed. It could accurately recognize and classify most parts of speech. My work mainly consisted of adding more vocabulary and missing morphology. The transducer had ~700 stems before GSoC and as of now, it contains ~1,500 stems. The original goal was to have ~7,000 stems but due to a lack of digitized resources, I could only get so far. The Wikipedia dump was great because it provided me with a lot of vocabulary and it was also useful for diacritic restoration, but unfortunately it also contained a lot of French, Portuguese, and English words.&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;I also added missing morphology for adjectives and pronouns to handle the &#039;old&#039; Lingala orthography. This increased coverage by about 4% at the time.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;I also added missing morphology for adjectives and pronouns to handle the &#039;old&#039; Lingala orthography. This increased coverage by about 4% at the time.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70399&amp;oldid=prev</id>
		<title>Eden: /* Bilingual dictionary */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70399&amp;oldid=prev"/>
		<updated>2019-08-26T15:33:45Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Bilingual dictionary&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:33, 26 August 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 26:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 26:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Wikipedia naïve coverage: &#039;&#039;&#039;72.61%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Wikipedia naïve coverage: &#039;&#039;&#039;72.61%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Bible naïve coverage: &#039;&#039;&#039;90.50%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Bible naïve coverage: &#039;&#039;&#039;90.50%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* WER of [https://github.com/apertium/apertium-eng-lin &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;here&lt;/del&gt;](Lin-Eng): &#039;&#039;&#039;47.15%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* WER of [https://github.com/apertium/apertium-eng-lin&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;/blob/master/dev/story_eng.txt&lt;/ins&gt; &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;story&lt;/ins&gt;](Lin-Eng): &#039;&#039;&#039;47.15%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* WER of same story as the above(Eng-Lin): &#039;&#039;&#039;50.93%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* WER of same story as the above(Eng-Lin): &#039;&#039;&#039;50.93%&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Lexical selection rules: &#039;&#039;&#039;~30&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Lexical selection rules: &#039;&#039;&#039;~30&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70396&amp;oldid=prev</id>
		<title>Eden: Created page with &quot;==Introduction== The goal of this project was to start the English-Lingala pair and and write an usable version which provides intelligible output.  ==Evaluation of Work Done=...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=User:Eden/GSoC2019Report&amp;diff=70396&amp;oldid=prev"/>
		<updated>2019-08-26T15:32:29Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;==Introduction== The goal of this project was to start the English-Lingala pair and and write an usable version which provides intelligible output.  ==Evaluation of Work Done=...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Introduction==&lt;br /&gt;
The goal of this project was to start the English-Lingala pair and write a usable version that provides intelligible output.&lt;br /&gt;
&lt;br /&gt;
==Evaluation of Work Done==&lt;br /&gt;
==== Morphological Analyzer ====&lt;br /&gt;
Code is &amp;#039;&amp;#039;&amp;#039;[https://github.com/apertium/apertium-lin here]&amp;#039;&amp;#039;&amp;#039;&amp;lt;br/&amp;gt;&lt;br /&gt;
Before GSoC 2019, the Lingala transducer was already fairly well-developed. It could accurately recognize and classify most parts of speech. My work mainly consisted of adding more vocabulary and missing morphology. The transducer had ~700 stems before GSoC and as of now, it contains ~1,500 stems. The original goal was to have ~7,000 stems but due to a lack of digitized resources, I could only get so far. The Wikipedia dump was great because it provided me with a lot of vocabulary and it was also useful for diacritic restoration, but unfortunately it also contained a lot of French, Portuguese, and English words.&amp;lt;br/&amp;gt;&amp;lt;br/&amp;gt;&lt;br /&gt;
I also added missing morphology for adjectives and pronouns to handle the &amp;#039;old&amp;#039; Lingala orthography. This increased coverage by about 4% at the time.&lt;br /&gt;
My mentor, Jonorthwash, also added more spell relax rules (thanks again btw).&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Current state of the transducer:&amp;lt;br/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Stems: &amp;#039;&amp;#039;&amp;#039;1,524&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* Wikipedia naïve coverage: &amp;#039;&amp;#039;&amp;#039;77.29%&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* Bible naïve coverage: &amp;#039;&amp;#039;&amp;#039;93.72%&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
==== Bilingual dictionary ====&lt;br /&gt;
Code is &amp;#039;&amp;#039;&amp;#039;[https://github.com/apertium/apertium-eng-lin here]&amp;#039;&amp;#039;&amp;#039;&amp;lt;br/&amp;gt;&lt;br /&gt;
The bilingual dictionary was written from scratch. Vocabulary mainly came from dictionaries and personal knowledge. The bilingual dictionary ([https://github.com/apertium/apertium-eng-lin/blob/master/apertium-eng-lin.eng-lin.dix apertium-eng-lin]) contains a lot of one-to-many entries because of the ambiguous nature of Lingala. I also wrote some transfer rules for both directions. A lot of rules and macros were recycled from more mature pairs (eng-fra, eng-cat), which makes the code cleaner and makes it easier to add rules later on. Transfer rules were limited to first-level (.t1x) rules because other levels weren&amp;#039;t yet necessary. Given the ambiguity of Lingala, I found lexical selection rules to be very effective in resolving some of it. &lt;br /&gt;
&lt;br /&gt;
Something to note is that Lingala has different dialects, each with grammar rules and an orthography that differ slightly from the rest. The two main dialects are Literary and Spoken Lingala. You can read more about them in this [http://www.lingref.com/cpp/acal/42/paper2778.pdf PDF]. Transfer rules and the Lingala transducer work best with Spoken Lingala. The Wikipedia corpus mostly contains Literary Lingala, while other texts (Bible, Quran) are written in Spoken Lingala. This is why the Bible translation has a much more intelligible output than the Wikipedia translation.&lt;br /&gt;
&lt;br /&gt;
Current state of the bilingual dictionary:&lt;br /&gt;
&lt;br /&gt;
* Stems: &amp;#039;&amp;#039;&amp;#039;1,802&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* Wikipedia naïve coverage: &amp;#039;&amp;#039;&amp;#039;72.61%&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* Bible naïve coverage: &amp;#039;&amp;#039;&amp;#039;90.50%&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* WER of [https://github.com/apertium/apertium-eng-lin here](Lin-Eng): &amp;#039;&amp;#039;&amp;#039;47.15%&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* WER of same story as the above(Eng-Lin): &amp;#039;&amp;#039;&amp;#039;50.93%&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* Lexical selection rules: &amp;#039;&amp;#039;&amp;#039;~30&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* Testvoc(Wikipedia corpus lin-eng):&lt;br /&gt;
** Number of tokenised words in the corpus: &amp;#039;&amp;#039;&amp;#039;589,666&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** Number of tokenised words unknown to analyser:   147,573  —  &amp;#039;&amp;#039;&amp;#039;25.0%&amp;#039;&amp;#039;&amp;#039; of tokens had *&lt;br /&gt;
** Tokenised words unknown to bidix:0  —   &amp;#039;&amp;#039;&amp;#039;0.0%&amp;#039;&amp;#039;&amp;#039; of tokens had @&lt;br /&gt;
** Tokenised words w/transfer errors or unknown to generator:   12,171  —   &amp;#039;&amp;#039;&amp;#039;2.1%&amp;#039;&amp;#039;&amp;#039; of tokens had #&lt;br /&gt;
** Error-free coverage of analyser only:            442,093  —  &amp;#039;&amp;#039;&amp;#039;75.0%&amp;#039;&amp;#039;&amp;#039; of tokens had no *&lt;br /&gt;
** Error-free coverage of analyser and bidix:       442,093  —  &amp;#039;&amp;#039;&amp;#039;75.0%&amp;#039;&amp;#039;&amp;#039; of tokens had no */@&lt;br /&gt;
** Error-free coverage of the full translator:      429,922  —  &amp;#039;&amp;#039;&amp;#039;72.9%&amp;#039;&amp;#039;&amp;#039; of tokens had no */@/#&lt;br /&gt;
&lt;br /&gt;
Translating verbs proved to be the most difficult part (it produced most of the # errors). Lingala verbs encode person, tense, mood, number, and animacy.&lt;br /&gt;
&lt;br /&gt;
==Future Work==&lt;br /&gt;
* Transfer rules for verbs that deal with tense, mood, compound and radical extension.&lt;br /&gt;
* Second-level rules (.t2x) for alliterative agreement, which will result in a more literary Lingala translation.&lt;br /&gt;
* Using offline resources to get vocabulary&lt;br /&gt;
&lt;br /&gt;
==Acknowledgments==&lt;br /&gt;
I would like to thank the whole Apertium community, and especially my mentors, Jonathan Washington, Mikel L. Forcada, and Анастасия Кузнецова, for their support, mentorship, and patience.&lt;/div&gt;</summary>
		<author><name>Eden</name></author>
		
	</entry>
</feed>