<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Task_ideas_for_Google_Code-in%2FGrow_bilingual</id>
	<title>Task ideas for Google Code-in/Grow bilingual - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Task_ideas_for_Google_Code-in%2FGrow_bilingual"/>
	<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;action=history"/>
	<updated>2026-05-05T17:03:57Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.1</generator>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=71058&amp;oldid=prev</id>
		<title>Tino Didriksen: /* How to find the most frequent unknowns */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=71058&amp;oldid=prev"/>
		<updated>2020-01-19T15:01:26Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;How to find the most frequent unknowns&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:01, 19 January 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 11:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 11:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;           ones with * at the start, sort, count number of hits per word, sort&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;           ones with * at the start, sort, count number of hits per word, sort&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;           again&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;           again&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; e.g.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; e.g.                                                         [11:25]&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; zcat corpus.txt.gz | apertium -d . ron-fra | tr &#039; &#039; &#039;\n&#039; | grep&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; zcat corpus.txt.gz | apertium -d . ron-fra | tr &#039; &#039; &#039;\n&#039; | grep&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;           &#039;^\*&#039; | sort |uniq -&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;d&lt;/del&gt; |sort -n &amp;gt;hitlist&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;           &#039;^\*&#039; | sort |&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; &lt;/ins&gt;uniq -&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;c&lt;/ins&gt; |&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; &lt;/ins&gt;sort -n &amp;gt;hitlist&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&amp;lt;asusAndrei&amp;gt; awesome!&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;                                                   [11:26]&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;asusAndrei&amp;gt; awesome!&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; hitlist will be unknowns sorted by frequency, but you might have to&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; hitlist will be unknowns sorted by frequency, but you might have to&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;           skip a couple that are &quot;strange&quot; or difficult to add&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;           skip a couple that are &quot;strange&quot; or difficult to add&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Tino Didriksen</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=70596&amp;oldid=prev</id>
		<title>Firespeaker at 18:29, 8 November 2019</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=70596&amp;oldid=prev"/>
		<updated>2019-11-08T18:29:39Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 18:29, 8 November 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Select a language pair&#039;&#039;&#039;, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Select a language pair&#039;&#039;&#039;, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from nightlies [[Installation#Installing:_a_summary|instructions here]]; clone the relevant language modules and pair from GitHub; make sure that it works.  Alternatively, get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from nightlies &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;(&lt;/ins&gt;[[Installation#Installing:_a_summary|instructions here]]&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;)&lt;/ins&gt;; clone the relevant language modules and pair from GitHub; make sure that it works.  Alternatively, get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (words in the source document which are not in the bilingual dictionaries of the language pair).  See below for information about how to do this.  Note: the beginner version of this task only requires 50 words.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (words in the source document which are not in the bilingual dictionaries of the language pair).  See below for information about how to do this.  Note: the beginner version of this task only requires 50 words.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Add these correspondences to the bilingual dictionary&#039;&#039;&#039; (the appropriate &amp;lt;code&amp;gt;.dix&amp;lt;/code&amp;gt; file) in [[bidix]] format (so that they are not unknown anymore), as well as the monolingual analysers if needed.  Make sure to categorise stems correctly.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Add these correspondences to the bilingual dictionary&#039;&#039;&#039; (the appropriate &amp;lt;code&amp;gt;.dix&amp;lt;/code&amp;gt; file) in [[bidix]] format (so that they are not unknown anymore), as well as the monolingual analysers if needed.  Make sure to categorise stems correctly.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Firespeaker</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=70595&amp;oldid=prev</id>
		<title>Firespeaker at 18:29, 8 November 2019</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=70595&amp;oldid=prev"/>
		<updated>2019-11-08T18:29:22Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 18:29, 8 November 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Select a language pair&#039;&#039;&#039;, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Select a language pair&#039;&#039;&#039;, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from nightlies; clone the relevant language modules and pair from GitHub; make sure that it works.  Alternatively, get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from nightlies&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; [[Installation#Installing:_a_summary|instructions here]]&lt;/ins&gt;; clone the relevant language modules and pair from GitHub; make sure that it works.  Alternatively, get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (source &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;words&lt;/del&gt; which are not&lt;del class=&quot;diffchange diffchange-inline&quot;&gt; &lt;/del&gt; in the bilingual dictionaries of the language pair).  See &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;words in the &lt;/ins&gt;source &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;document&lt;/ins&gt; which are not in the bilingual dictionaries of the language pair).  See &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;below for information about how to do this.  Note: the beginner version of this task only requires 50 words.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Add these correspondences to the bilingual dictionary&#039;&#039;&#039; (the appropriate &amp;lt;code&amp;gt;.dix&amp;lt;/code&amp;gt; file) in [[bidix]] format (so that they are not unknown anymore), as well as the monolingual analysers if needed.  Make sure to categorise stems correctly.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Add these correspondences to the bilingual dictionary&#039;&#039;&#039; (the appropriate &amp;lt;code&amp;gt;.dix&amp;lt;/code&amp;gt; file) in [[bidix]] format (so that they are not unknown anymore), as well as the monolingual analysers if needed.  Make sure to categorise stems correctly.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Compile and test again&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Compile and test again&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Firespeaker</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=70594&amp;oldid=prev</id>
		<title>Firespeaker at 18:22, 8 November 2019</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=70594&amp;oldid=prev"/>
		<updated>2019-11-08T18:22:43Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 18:22, 8 November 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;select&lt;/del&gt; a language pair&#039;&#039;&#039;, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Select&lt;/ins&gt; a language pair&#039;&#039;&#039;, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;the Subversion repository&lt;/del&gt;; &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;install&lt;/del&gt; the language pair; make sure that it works &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;and/or&lt;/del&gt; get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;nightlies&lt;/ins&gt;; &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;clone&lt;/ins&gt; the&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; relevant&lt;/ins&gt; language&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; modules and&lt;/ins&gt; pair&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; from GitHub&lt;/ins&gt;; make sure that it works&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;.&lt;/ins&gt; &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; Alternatively,&lt;/ins&gt; get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (source words which are not  in the bilingual dictionaries of the language pair). &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (source words which are not  in the bilingual dictionaries of the language pair).&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;  See&lt;/ins&gt; &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;add&lt;/del&gt; these correspondences to the bilingual dictionary&#039;&#039;&#039; (the appropriate &amp;lt;code&amp;gt;.dix&amp;lt;/code&amp;gt; file) in [[bidix]] format (so that they are not unknown anymore). &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Add&lt;/ins&gt; these correspondences to the bilingual dictionary&#039;&#039;&#039; (the appropriate &amp;lt;code&amp;gt;.dix&amp;lt;/code&amp;gt; file) in [[bidix]] format (so that they are not unknown anymore)&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;, as well as the monolingual analysers if needed&lt;/ins&gt;. &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; Make sure to categorise stems correctly.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Compile and test again&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Compile and test again&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Submit&#039;&#039;&#039; a &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;patch&lt;/del&gt; to &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;your mentor (or commit it if you have already gained&lt;/del&gt; &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;developer&lt;/del&gt; &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;access)&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Submit&#039;&#039;&#039; a &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;pull request&lt;/ins&gt; to &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;the&lt;/ins&gt; &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;GitHub&lt;/ins&gt; &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;repositories&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;==How to find the most frequent unknowns==&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;pre&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; translate your corpus, make it one word per line, grab only the&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;           ones with * at the start, sort, count number of hits per word, sort&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;           again&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; e.g.                                                         [11:25]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; zcat corpus.txt.gz | apertium -d . ron-fra | tr &#039; &#039; &#039;\n&#039; | grep&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;           &#039;^\*&#039; | sort |uniq -d |sort -n &amp;gt;hitlist&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;asusAndrei&amp;gt; awesome!                                                   [11:26]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; hitlist will be unknowns sorted by frequency, but you might have to&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;           skip a couple that are &quot;strange&quot; or difficult to add&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;Unhammer&amp;gt; and that&#039;s ok, as long as you start from the most frequent and work&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;           your way down&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&amp;lt;/pre&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;[[Category:Tasks for Google Code-in|Grow bilingual]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;[[Category:Tasks for Google Code-in|Grow bilingual]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Tino Didriksen</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=65028&amp;oldid=prev</id>
		<title>Firespeaker at 03:25, 21 December 2017</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=65028&amp;oldid=prev"/>
		<updated>2017-12-21T03:25:05Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 03:25, 21 December 2017&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 2:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 2:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from the Subversion repository; install the language pair; make sure that it works and/or get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Install Apertium&#039;&#039;&#039; locally from the Subversion repository; install the language pair; make sure that it works and/or get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (source words which are not  in the bilingual dictionaries of the language pair). &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &#039;&#039;&#039;detect the 200 most frequent unknown words&#039;&#039;&#039; (source words which are not  in the bilingual dictionaries of the language pair). &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;add these correspondences to the bilingual dictionary&#039;&#039;&#039; (so that they are not unknown anymore). &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;add these correspondences to the bilingual dictionary&#039;&#039;&#039;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; (the appropriate &amp;lt;code&amp;gt;.dix&amp;lt;/code&amp;gt; file) in [[bidix]] format&lt;/ins&gt; (so that they are not unknown anymore). &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Compile and test again&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Compile and test again&#039;&#039;&#039;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Submit&#039;&#039;&#039; a patch to your mentor (or commit it if you have already gained developer access)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# &#039;&#039;&#039;Submit&#039;&#039;&#039; a patch to your mentor (or commit it if you have already gained developer access)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Firespeaker</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=64941&amp;oldid=prev</id>
		<title>Firespeaker at 04:51, 14 December 2017</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=64941&amp;oldid=prev"/>
		<updated>2017-12-14T04:51:29Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 04:51, 14 December 2017&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# select a language pair, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;select a language pair&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Install Apertium locally from the Subversion repository; install the language pair; make sure that it works and/or get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;Install Apertium&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt; locally from the Subversion repository; install the language pair; make sure that it works and/or get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) detect the &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;50&lt;/del&gt; most frequent unknown words (source words which are not  in the bilingual dictionaries of the language pair). &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;detect the &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;200&lt;/ins&gt; most frequent unknown words&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt; (source words which are not  in the bilingual dictionaries of the language pair). &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# add these correspondences to the bilingual dictionary (so that they are not unknown anymore). &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;add these correspondences to the bilingual dictionary&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt; (so that they are not unknown anymore). &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Compile and test again&lt;del class=&quot;diffchange diffchange-inline&quot;&gt; &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;Compile and test again&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Submit a patch to your mentor (or commit it if you have already gained developer access)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;Submit&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt; a patch to your mentor (or commit it if you have already gained developer access)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;[[Category:Tasks for Google Code-in|Grow bilingual]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;[[Category:Tasks for Google Code-in|Grow bilingual]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Firespeaker</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=50869&amp;oldid=prev</id>
		<title>Mlforcada: Created page with &quot;# select a language pair, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rat...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Task_ideas_for_Google_Code-in/Grow_bilingual&amp;diff=50869&amp;oldid=prev"/>
		<updated>2014-11-07T12:07:23Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;# select a language pair, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rat...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;# select a language pair, ideally such that the source language is a language you know (L₂) and the target language a language you use every day (L₁), such that it has rather good monolingual dictionaries in Apertium but no reasonable bilingual dictionary (these language pairs are usually in the incubator), for instance apertium-spa-pol&lt;br /&gt;
# Install Apertium locally from the Subversion repository; install the language pair; make sure that it works and/or get [http://wiki.apertium.org/wiki/Apertium_VirtualBox Apertium VirtualBox] and update, check out &amp;amp; compile the language pair. &lt;br /&gt;
# Using a large enough corpus of representative text in the source language (e.g. plain text taken from Wikipedia, newspapers, literature, etc.) detect the 50 most frequent unknown words (source words which are not  in the bilingual dictionaries of the language pair). &lt;br /&gt;
# add these correspondences to the bilingual dictionary (so that they are not unknown anymore). &lt;br /&gt;
# Compile and test again &lt;br /&gt;
# Submit a patch to your mentor (or commit it if you have already gained developer access)&lt;br /&gt;
&lt;br /&gt;
[[Category:Tasks for Google Code-in|Grow bilingual]]&lt;/div&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
</feed>