<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Ideas_for_Google_Summer_of_Code%2FEliminate_trimming</id>
	<title>Ideas for Google Summer of Code/Eliminate trimming - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Ideas_for_Google_Summer_of_Code%2FEliminate_trimming"/>
	<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;action=history"/>
	<updated>2026-05-05T17:36:29Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.1</generator>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=72113&amp;oldid=prev</id>
		<title>Ilnar.salimzyan: un typo</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=72113&amp;oldid=prev"/>
		<updated>2020-04-23T10:26:50Z</updated>

		<summary type="html">&lt;p&gt;un typo&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:26, 23 April 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;Dictionary trimming is a thing in apertium where we remove stuff from monolingual language models (FSTs compiled from monodixes) so they only contain word-forms that the translation model (FST compiled of bidix) knows of. The practical effect of this is that words missing from bidix are treated the same as words missing from monodix, making debugging harder. This is &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;workaroudable&lt;/del&gt; by compiling trimmed and untrimmed FSAs separately for debug and development process and adding more modes and trying to remember which modes go with which but it&#039;s error-prone and unmanageable. Furthermore, throwing good information away early is not a good thing, even when bidix is missing some stuff other parts of the pipeline may benefit from the stuff that got thrown out. Ideally we would want to keep maximal amount of stuff intact and usable and only programmatically select what is displayed when: source language or target language, word-form or lemma...&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;Dictionary trimming is a thing in apertium where we remove stuff from monolingual language models (FSTs compiled from monodixes) so they only contain word-forms that the translation model (FST compiled of bidix) knows of. The practical effect of this is that words missing from bidix are treated the same as words missing from monodix, making debugging harder. This is &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;workaroundable&lt;/ins&gt; by compiling trimmed and untrimmed FSAs separately for debug and development process and adding more modes and trying to remember which modes go with which but it&#039;s error-prone and unmanageable. Furthermore, throwing good information away early is not a good thing, even when bidix is missing some stuff other parts of the pipeline may benefit from the stuff that got thrown out. Ideally we would want to keep maximal amount of stuff intact and usable and only programmatically select what is displayed when: source language or target language, word-form or lemma...&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Ilnar.salimzyan</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71363&amp;oldid=prev</id>
		<title>Popcorndude: categorize</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71363&amp;oldid=prev"/>
		<updated>2020-03-24T19:51:59Z</updated>

		<summary type="html">&lt;p&gt;categorize&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 19:51, 24 March 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 14:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 14:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [[Why we trim]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [[Why we trim]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Unhammer et al. (20xx) Trimming...&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Unhammer et al. (20xx) Trimming...&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;[[Category:Ideas_for_Google_Summer_of_Code]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Popcorndude</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71256&amp;oldid=prev</id>
		<title>TommiPirinen: /* Further reading */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71256&amp;oldid=prev"/>
		<updated>2020-03-20T16:00:12Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Further reading&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 16:00, 20 March 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 12:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 12:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;== Further reading ==&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;== Further reading ==&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* [[Why we trim]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Unhammer et al. (20xx) Trimming...&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Unhammer et al. (20xx) Trimming...&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>TommiPirinen</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71255&amp;oldid=prev</id>
		<title>TommiPirinen: /* Coding challenge */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71255&amp;oldid=prev"/>
		<updated>2020-03-20T15:54:59Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Coding challenge&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 15:54, 20 March 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 8:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 8:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* Check out different trimming methods in apertium pairs, including HFST trimming and lttoolbox trimming (Norwegian, North Sámi...)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* Check out different trimming methods in apertium pairs, including HFST trimming and lttoolbox trimming (Norwegian, North Sámi...)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* Figure out and explain trimming in https://github.com/apertium/apertium-sme-nob/ (hint: Makefile.am)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;== Further reading ==&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;== Further reading ==&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>TommiPirinen</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71252&amp;oldid=prev</id>
		<title>TommiPirinen: Created page with &quot;Dictionary trimming is a thing in apertium where we remove stuff from monolingual language models (FSTs compiled from monodixes) so they only contain word-forms that the trans...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Ideas_for_Google_Summer_of_Code/Eliminate_trimming&amp;diff=71252&amp;oldid=prev"/>
		<updated>2020-03-20T14:55:20Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;Dictionary trimming is a thing in apertium where we remove stuff from monolingual language models (FSTs compiled from monodixes) so they only contain word-forms that the trans...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Dictionary trimming is a thing in apertium where we remove stuff from monolingual language models (FSTs compiled from monodixes) so they only contain word-forms that the translation model (FST compiled of bidix) knows of. The practical effect of this is that words missing from bidix are treated the same as words missing from monodix, making debugging harder. This is workaroudable by compiling trimmed and untrimmed FSAs separately for debug and development process and adding more modes and trying to remember which modes go with which but it&amp;#039;s error-prone and unmanageable. Furthermore, throwing good information away early is not a good thing, even when bidix is missing some stuff other parts of the pipeline may benefit from the stuff that got thrown out. Ideally we would want to keep maximal amount of stuff intact and usable and only programmatically select what is displayed when: source language or target language, word-form or lemma...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Task ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Coding challenge ==&lt;br /&gt;
&lt;br /&gt;
* Check out different trimming methods in apertium pairs, including HFST trimming and lttoolbox trimming (Norwegian, North Sámi...)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Further reading ==&lt;br /&gt;
&lt;br /&gt;
* Unhammer et al. (20xx) Trimming...&lt;/div&gt;</summary>
		<author><name>TommiPirinen</name></author>
		
	</entry>
</feed>