<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Why_we_trim</id>
	<title>Why we trim - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Why_we_trim"/>
	<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;action=history"/>
	<updated>2026-05-05T17:28:10Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.1</generator>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=72152&amp;oldid=prev</id>
		<title>Unhammer: /* See also */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=72152&amp;oldid=prev"/>
		<updated>2020-05-03T16:06:41Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;See also&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 16:06, 3 May 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [[Testvoc]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [[Testvoc]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [http://wiki.apertium.eu/index.php/Session_7 Session 7: Data consistency and quality] on wiki.apertium.eu&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [http://wiki.apertium.eu/index.php/Session_7 Session 7: Data consistency and quality] on wiki.apertium.eu&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;* http://tinodidriksen.com/pisg/freenode/logs/%23apertium/2020-05-03.log&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;[[Category:Quality control]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;[[Category:Quality control]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=72150&amp;oldid=prev</id>
		<title>Unhammer: /* See also */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=72150&amp;oldid=prev"/>
		<updated>2020-05-03T11:26:18Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;See also&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 11:26, 3 May 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 23:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [[Testvoc]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [[Testvoc]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [http://wiki.apertium.eu/index.php/Session_7 Session 7: Data consistency and quality] on wiki.apertium.eu&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [http://wiki.apertium.eu/index.php/Session_7 Session 7: Data consistency and quality] on wiki.apertium.eu&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* http://tinodidriksen.com/pisg/freenode/logs/%23apertium/2020-05-03.log&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;[[Category:Quality control]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;[[Category:Quality control]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=50604&amp;oldid=prev</id>
		<title>Bech: Link to French page</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=50604&amp;oldid=prev"/>
		<updated>2014-10-08T08:34:47Z</updated>

		<summary type="html">&lt;p&gt;Link to French page&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 08:34, 8 October 2014&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;[[Pourquoi nous tronquons|En français]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;In Apertium language pairs, we keep the monolingual and bilingual dictionaries &#039;&#039;trimmed&#039;&#039;, so that all entries from the analyser will have some match in the bidix, and all output from transfer will have some entry in the generator.&amp;lt;ref&amp;gt;Typically this goes for both translation directions, although a language pair only released for one direction might only be trimmed in that direction.&amp;lt;/ref&amp;gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;In Apertium language pairs, we keep the monolingual and bilingual dictionaries &#039;&#039;trimmed&#039;&#039;, so that all entries from the analyser will have some match in the bidix, and all output from transfer will have some entry in the generator.&amp;lt;ref&amp;gt;Typically this goes for both translation directions, although a language pair only released for one direction might only be trimmed in that direction.&amp;lt;/ref&amp;gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Bech</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=46746&amp;oldid=prev</id>
		<title>Unhammer: /* See also */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=46746&amp;oldid=prev"/>
		<updated>2014-02-14T09:31:34Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;See also&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 09:31, 14 February 2014&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 18:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 18:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;==See also==&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;==See also==&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [[Automatically trimming a monodix]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [[Automatically trimming a monodix]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* [[lt-trim]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [[Testvoc]]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [[Testvoc]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;* [http://wiki.apertium.eu/index.php/Session_7 Session 7: Data consistency and quality] on wiki.apertium.eu&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;* [http://wiki.apertium.eu/index.php/Session_7 Session 7: Data consistency and quality] on wiki.apertium.eu&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36909&amp;oldid=prev</id>
		<title>Unhammer at 13:13, 16 October 2012</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36909&amp;oldid=prev"/>
		<updated>2012-10-16T13:13:40Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 13:13, 16 October 2012&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Although there could be a technical solution&lt;del class=&quot;diffchange diffchange-inline&quot;&gt; (&amp;lt;code&amp;gt;lt-proc -o&amp;lt;/code&amp;gt;)&lt;/del&gt; to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; (&amp;lt;code&amp;gt;lt-proc -o&amp;lt;/code&amp;gt;)&lt;/ins&gt;, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;gt;&amp;amp;lt;m&amp;amp;gt;&amp;amp;lt;sg&amp;amp;gt;$), or in the middle of compounds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;gt;&amp;amp;lt;m&amp;amp;gt;&amp;amp;lt;sg&amp;amp;gt;$), or in the middle of compounds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36908&amp;oldid=prev</id>
		<title>Unhammer at 13:13, 16 October 2012</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36908&amp;oldid=prev"/>
		<updated>2012-10-16T13:13:22Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 13:13, 16 October 2012&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# Although there could be a technical solution&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; (&amp;lt;code&amp;gt;lt-proc -o&amp;lt;/code&amp;gt;)&lt;/ins&gt; to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;gt;&amp;amp;lt;m&amp;amp;gt;&amp;amp;lt;sg&amp;amp;gt;$), or in the middle of compounds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;gt;&amp;amp;lt;m&amp;amp;gt;&amp;amp;lt;sg&amp;amp;gt;$), or in the middle of compounds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36907&amp;oldid=prev</id>
		<title>Unhammer: bah</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36907&amp;oldid=prev"/>
		<updated>2012-10-16T13:11:36Z</updated>

		<summary type="html">&lt;p&gt;bah&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 13:11, 16 October 2012&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 7:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 7:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without surface forms, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without surface forms, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;lt;m&amp;amp;gt;&amp;amp;lt;sg&amp;amp;gt;$), or in the middle of compunds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&amp;amp;gt;&lt;/ins&gt;&amp;amp;lt;m&amp;amp;gt;&amp;amp;lt;sg&amp;amp;gt;$), or in the middle of compunds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36906&amp;oldid=prev</id>
		<title>Unhammer at 13:11, 16 October 2012</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36906&amp;oldid=prev"/>
		<updated>2012-10-16T13:11:14Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 13:11, 16 October 2012&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 7:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 7:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without surface forms, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without surface forms, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have to keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;lt;&amp;amp;gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;m&lt;/del&gt;&amp;amp;lt;&amp;amp;gt&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;;sg&amp;amp;lt&lt;/del&gt;;$), or in the middle of compunds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;lt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;m&lt;/ins&gt;&amp;amp;gt;&amp;amp;lt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;sg&lt;/ins&gt;&amp;amp;gt;$), or in the middle of compunds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36905&amp;oldid=prev</id>
		<title>Unhammer at 13:10, 16 October 2012</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36905&amp;oldid=prev"/>
		<updated>2012-10-16T13:10:30Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 13:10, 16 October 2012&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;(It&#039;s not possible to distribute the surface form over the two units; in this case, the split was at a space, but it could be anywhere, and the surface form gives no indication of where.)&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;#* Can&#039;t you just distribute the surface form over the two units? ^writes/write&amp;amp;lt;vblex&amp;amp;gt;$ ^about/about&amp;amp;lt;pr&amp;amp;gt;$! While in this constructed example, the split was at a space, it could be anywhere. The surface form gives no general indication of where. We have multiwords that split in the middle of contractions (^au/à&amp;amp;lt;pr&amp;amp;gt;+le&amp;amp;lt;det&amp;amp;gt;&amp;amp;lt;def&amp;amp;lt;&amp;amp;gt;m&amp;amp;lt;&amp;amp;gt;sg&amp;amp;lt;$), or in the middle of compunds (^vasskokaren/vatn&amp;amp;lt;n&amp;amp;gt;+kokar&amp;amp;lt;n&amp;amp;gt;$)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36904&amp;oldid=prev</id>
		<title>Unhammer at 12:56, 16 October 2012</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Why_we_trim&amp;diff=36904&amp;oldid=prev"/>
		<updated>2012-10-16T12:56:43Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 12:56, 16 October 2012&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 6:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Transfer rules quite often use target language information from bidix to fill in tags etc. If transfer from English to Spanish reads a chunk like &quot;the children&quot;, the Spanish determiner needs to get the number and gender information from the &#039;&#039;target language&#039;&#039; noun. It is not enough to look at the output of the source language analyser, number can be changed by bidix for certain nouns, and gender is not even present in the source language. The transfer rule expects to have this information; without it, not only will the noun be output as @lemma, but the determiner will not be generated correctly either. This effect gets even worse with bigger chunks.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;#* One might work around this by having exceptions in the transfer rules to e.g. guess number and gender if bidix doesn&#039;t give any, but this leads to an enormous increase in transfer complexity – all tags have to be presumed to be unknown, and developer time is wasted on bug-hunting and workarounds instead of improving translation quality.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown. (It&#039;s not possible to distribute the surface form over the two units; in this case, the split was at a space, but it could be anywhere, and the surface form gives no indication of where.)&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# Although there could be a technical solution to carrying over the source word if it&#039;s not in the bidix, this leads to problems with compounds and other multiwords that are split into two lexical units before bidix lookup: What do you do when part of a multiword is unknown? For example, if we have ^writes about/write&amp;amp;lt;vblex&amp;amp;gt;+about&amp;amp;lt;pr&amp;amp;gt;$, this is currently split before bidix lookup into two units ^write&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;$, without lemmas, and if only one is unknown after bidix lookup, the other will still translate: ^write&amp;amp;lt;vblex&amp;amp;gt;/escribir&amp;amp;lt;vblex&amp;amp;gt;$ ^about&amp;amp;lt;pr&amp;amp;gt;/@about&amp;amp;lt;pr&amp;amp;gt;$. If, on the other hand, we were to keep the surface form around, we would also have keep it as one unit in bidix lookup, such that if parts of the multiword were unknown, all of it would be marked unknown&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;, giving something like ^@writes about/write&amp;amp;lt;vblex&amp;amp;gt;+@about&amp;amp;lt;pr&amp;amp;gt;$&lt;/ins&gt;. (It&#039;s not possible to distribute the surface form over the two units; in this case, the split was at a space, but it could be anywhere, and the surface form gives no indication of where.)&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Unhammer</name></author>
		
	</entry>
</feed>