<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Talk%3AGoogle_Summer_of_Code%2FApplication_2019</id>
	<title>Talk:Google Summer of Code/Application 2019 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Talk%3AGoogle_Summer_of_Code%2FApplication_2019"/>
	<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;action=history"/>
	<updated>2026-05-09T08:20:10Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.1</generator>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68473&amp;oldid=prev</id>
		<title>Xavivars: /* GSoC proposals */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68473&amp;oldid=prev"/>
		<updated>2019-01-31T16:57:30Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;GSoC proposals&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 16:57, 31 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 84:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 84:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Update the functionality of the Java engine. Port a CG3 engine. Port (or adapt existing) HFST engine.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Update the functionality of the Java engine. Port a CG3 engine. Port (or adapt existing) HFST engine.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]:  Disagree with this one. Perhaps I would rephrase it &quot;get pipeline for C++ to Android&quot; or &quot;get pipeline for generating from C++ to usage in OmegaT&quot;. I would like to add a section on integration/interoperability with other language data. Currently the UD project has freely-licensed annotated corpora for a good many languages in Apertium. Using them in Apertium is problematic because of different tokenisation standards and different tagsets. There should be some effort to make these more interoperable so that we can avoid duplicating effort (having people annotate English text).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]:  Disagree with this one. Perhaps I would rephrase it &quot;get pipeline for C++ to Android&quot; or &quot;get pipeline for generating from C++ to usage in OmegaT&quot;. I would like to add a section on integration/interoperability with other language data. Currently the UD project has freely-licensed annotated corpora for a good many languages in Apertium. Using them in Apertium is problematic because of different tokenisation standards and different tagsets. There should be some effort to make these more interoperable so that we can avoid duplicating effort (having people annotate English text).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) +1 &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;#&lt;/ins&gt;## [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) +1&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; to Fran&#039;s comment&lt;/ins&gt; &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: &#039;&#039;&#039;MEETING PROPOSAL&#039;&#039;&#039;: If many people are going to be in Dublin this year for the [https://www.mtsummit2019.com/ MT Summit], how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSOC this year, but it&#039;s definitely nearly ten years since we had a real planning meeting. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: &#039;&#039;&#039;MEETING PROPOSAL&#039;&#039;&#039;: If many people are going to be in Dublin this year for the [https://www.mtsummit2019.com/ MT Summit], how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSOC this year, but it&#039;s definitely nearly ten years since we had a real planning meeting. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea. Let me see what we can do.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea. Let me see what we can do.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Xavivars</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68445&amp;oldid=prev</id>
		<title>Xavivars: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68445&amp;oldid=prev"/>
		<updated>2019-01-26T12:02:56Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 12:02, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 5:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 5:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;Does Apertium as a project need more direction? Where would Apertium stand in the [https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar cathedral–bazaar] line? Most work in Apertium is bazaar-like (people work on the languages and platform features they want) but we have some cathedral-style action such as this elected PMC. We also vote on GSoC projects, etc. But ---and this is mainly my fault as long-standing president--- we have not devoted enough effort to planning. One could say that Apertium basically flies on autopilot, and that, every now and then, some Apertiumers correct the course. I think we need to decide and clarify how to implement our mission (see below). And we need to give ourselves a method to do so.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;Does Apertium as a project need more direction? Where would Apertium stand in the [https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar cathedral–bazaar] line? Most work in Apertium is bazaar-like (people work on the languages and platform features they want) but we have some cathedral-style action such as this elected PMC. We also vote on GSoC projects, etc. But ---and this is mainly my fault as long-standing president--- we have not devoted enough effort to planning. One could say that Apertium basically flies on autopilot, and that, every now and then, some Apertiumers correct the course. I think we need to decide and clarify how to implement our mission (see below). And we need to give ourselves a method to do so.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) IMHO, having a project roadmap, regardless of GSoC, would be an extremely powerful tool for understanding how we want to achieve our mission. Even if the mission of Apertium &quot;as an idea&quot; is clear, there hasn&#039;t been much clarity on &quot;Apertium as a FOSS project&quot;.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;** Do we want to grow Apertium by getting more contributors?&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;** If we had to choose, do we prefer a solid foundation (core modules that are easier to understand and contribute to) or wider coverage (more specialised modules that do specific things)?&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;Desired impact, intended mission. We need to collectively decide what is important and concentrate our efforts there. Our declared mission is laid out in the bylaws as follows:&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;Desired impact, intended mission. We need to collectively decide what is important and concentrate our efforts there. Our declared mission is laid out in the bylaws as follows:&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 30:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 35:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree; some of these don&#039;t even use the latest code from -lex-tools and -separable, and a substantial amount of work could be done to improve them based on these. Out of the three: Portuguese–Spanish = coverage is the main problem as far as I can tell: we have 15k entries and need 50k or so, i.e. &amp;gt;95% coverage instead of &amp;gt;90%; French–Spanish = needs more rules and much better disambiguation.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree; some of these don&#039;t even use the latest code from -lex-tools and -separable, and a substantial amount of work could be done to improve them based on these. Out of the three: Portuguese–Spanish = coverage is the main problem as far as I can tell: we have 15k entries and need 50k or so, i.e. &amp;gt;95% coverage instead of &amp;gt;90%; French–Spanish = needs more rules and much better disambiguation.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;* [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) There has been a lot of effort recently in fra-cat (Hèctor) and spa-cat (Jaume). I&#039;m pretty sure a lot of rules could be reused for fra-spa.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 38:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 44:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Tino_Didriksen|Tino Didriksen]]: My very strong opinion about Java is to broadly speaking forget about Java. Maintaining Java ports is quite frankly a waste of time, and such ports would too quickly lag behind the C++ codebases. All the C++ code works cross-platform, also on Android - it just needs scripts to be built and bundled correctly, which is a mostly one-time setup task perfectly suitable for GSoC. What we may need is a JNI layer and better C++ APIs so that Java can call the native backends without spawning separate piped processes - this would also be a good GSoC project. An alternative if one really wants non-native libraries is something like Emscripten that can compile the C++ code to JavaScript, but that&#039;s just yet another build target similar to Android.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Tino_Didriksen|Tino Didriksen]]: My very strong opinion about Java is to broadly speaking forget about Java. Maintaining Java ports is quite frankly a waste of time, and such ports would too quickly lag behind the C++ codebases. All the C++ code works cross-platform, also on Android - it just needs scripts to be built and bundled correctly, which is a mostly one-time setup task perfectly suitable for GSoC. What we may need is a JNI layer and better C++ APIs so that Java can call the native backends without spawning separate piped processes - this would also be a good GSoC project. An alternative if one really wants non-native libraries is something like Emscripten that can compile the C++ code to JavaScript, but that&#039;s just yet another build target similar to Android.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) I agree with Fran and Tino: having two implementations of Apertium doesn&#039;t scale in general, and even less in a project like ours. There are simply not enough people writing code to keep both versions in sync. It&#039;s true that we&#039;ve been bringing the Java port up to speed via GSoC projects, but most of the recent modules (apertium-separable, apertium-lex-tools) and other tools widely used in pairs (cg) are not available in the Java port. And if we had to focus on something, I&#039;d rather spend the time fixing the known bugs in the C++ codebase than reimplementing in Java what we already have in C++. That said, I personally like the Java codebase better: the C++ code is quite difficult to understand and contribute to, even for someone who is relatively familiar with the project and writes code for a living.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# improve the CG3 processor (it is too slow now, I gather, and is heavily used in many languages).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# improve the CG3 processor (it is too slow now, I gather, and is heavily used in many languages).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: CG3 is rarely the bottleneck until you get up to thousands of rules. The transfer component is usually a bigger bottleneck, although I would welcome hard facts about that. In any case, I think Apertium is reasonably fast most of the time. Improvements are always possible, though.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: CG3 is rarely the bottleneck until you get up to thousands of rules. The transfer component is usually a bigger bottleneck, although I would welcome hard facts about that. In any case, I think Apertium is reasonably fast most of the time. Improvements are always possible, though.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;### [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) If I&#039;m not mistaken, the Java transfer was faster than the C++ one. Could we come up with a solution that reuses its ideas (not its implementation)? For example, could we move from &quot;interpreted&quot; transfer rules to a &quot;compiled&quot; version of them (as we do for all dictionaries)?&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Tino_Didriksen|Tino Didriksen]]: It&#039;s been tried, and it mostly failed. To get truly explosive performance boosts, the corresponding limitations in rule number and rule complexity one has to live with are stifling. But I&#039;m always open to new ideas, and I hope to be proven wrong: while I don&#039;t think CG-3 is too slow, faster is better.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Tino_Didriksen|Tino Didriksen]]: It&#039;s been tried, and it mostly failed. To get truly explosive performance boosts, the corresponding limitations in rule number and rule complexity one has to live with are stifling. But I&#039;m always open to new ideas, and I hope to be proven wrong: while I don&#039;t think CG-3 is too slow, faster is better.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) I agree CG-3 is quite slow, and in some languages where disambiguation has improved a lot via rules (e.g. Catalan) it can become the bottleneck pretty easily. But even if it is the bottleneck, and an improvement there would be helpful, Apertium is still quite fast.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]:For me, there are a number of things that should be improved in the engine.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]:For me, there are a number of things that should be improved in the engine.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;### The first is that each component should have the potential to be weighted, and these weights should be able to be learnt jointly. This is a big project, probably at least a PhD&#039;s worth of work, but it has the potential to offer very big improvements.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;### The first is that each component should have the potential to be weighted, and these weights should be able to be learnt jointly. This is a big project, probably at least a PhD&#039;s worth of work, but it has the potential to offer very big improvements.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 46:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 55:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;### The format handling should be reworked; we really need it and it&#039;s so close. Unfortunately, it is not something that could be done in a GSoC project by a new Apertiumer.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;### The format handling should be reworked; we really need it and it&#039;s so close. Unfortunately, it is not something that could be done in a GSoC project by a new Apertiumer.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;### A more expressive and more linguist-friendly transfer formalism. The problem is that this is a non-trivial task, and one that is also hard to find money for; when I mentioned it to someone recently, they said &quot;that sounds like something the EU would have funded in the early 2000s&quot;.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;### A more expressive and more linguist-friendly transfer formalism. The problem is that this is a non-trivial task, and one that is also hard to find money for; when I mentioned it to someone recently, they said &quot;that sounds like something the EU would have funded in the early 2000s&quot;.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;### An interesting research idea that people have discussed with me would&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;  &lt;/del&gt; be to have Apertium be able to generate data for neural MT, but with a more expressive formalism. We have done some work on this already, but not with a very generative system. I have some grant ideas about that, e.g. from field linguistics -&amp;gt; NMT; I can expound on them if people are interested. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;### An interesting research idea that people have discussed with me would be to have Apertium be able to generate data for neural MT, but with a more expressive formalism. We have done some work on this already, but not with a very generative system. I have some grant ideas about that, e.g. from field linguistics -&amp;gt; NMT; I can expound on them if people are interested. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;### [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) As I mentioned before, the codebase as a whole is a mess, and one of the reasons is how it has evolved: something that started as a proof of concept for some specific research has become a &quot;production&quot; module. It&#039;s amazing to get new modules implementing new features like that, but we could do far more with a cleaner codebase, with more automated tests to prevent regressions, etc. Investing in the quality of the existing codebase would, I think, pay off in the future.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;=== Evaluation ===&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;=== Evaluation ===&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 74:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 84:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Update the functionality of the Java engine. Port a CG3 engine. Port (or adapt existing) HFST engine.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Update the functionality of the Java engine. Port a CG3 engine. Port (or adapt existing) HFST engine.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]:  Disagree with this one. Perhaps I would rephrase it &quot;get pipeline for C++ to Android&quot; or &quot;get pipeline for generating from C++ to usage in OmegaT&quot;. I would like to add a section on integration/interoperability with other language data. Currently the UD project has freely-licensed annotated corpora for a good many languages in Apertium. Using them in Apertium is problematic because of different tokenisation standards and different tagsets. There should be some effort to make these more interoperable so that we can avoid duplicating effort (having people annotate English text).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]:  Disagree with this one. Perhaps I would rephrase it &quot;get pipeline for C++ to Android&quot; or &quot;get pipeline for generating from C++ to usage in OmegaT&quot;. I would like to add a section on integration/interoperability with other language data. Currently the UD project has freely-licensed annotated corpora for a good many languages in Apertium. Using them in Apertium is problematic because of different tokenisation standards and different tagsets. There should be some effort to make these more interoperable so that we can avoid duplicating effort (having people annotate English text).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Xavivars|Xavivars]] ([[User talk:Xavivars|talk]]) +1 &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: &#039;&#039;&#039;MEETING PROPOSAL&#039;&#039;&#039;: If many people are going to be in Dublin this year for the [https://www.mtsummit2019.com/ MT Summit], how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSoC this year, but it&#039;s been nearly ten years since we had a real planning meeting. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: &#039;&#039;&#039;MEETING PROPOSAL&#039;&#039;&#039;: If many people are going to be in Dublin this year for the [https://www.mtsummit2019.com/ MT Summit], how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSoC this year, but it&#039;s been nearly ten years since we had a real planning meeting. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea. Let me see what we can do.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea. Let me see what we can do.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Xavivars</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68441&amp;oldid=prev</id>
		<title>Xavivars: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68441&amp;oldid=prev"/>
		<updated>2019-01-26T11:39:36Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 11:39, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 18:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 18:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;We have some interesting data collected in our wiki, but I think we need to evaluate impact better. We would need a more complete survey about where Apertium is used (in research, commercially, etc.). I know GSoC does not pay for this, but we do have some money in our kitty (€35,000 or more).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;We have some interesting data collected in our wiki, but I think we need to evaluate impact better. We would need a more complete survey about where Apertium is used (in research, commercially, etc.). I know GSoC does not pay for this, but we do have some money in our kitty (€35,000 or more).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;=== Languages ===&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;Languages&#039;&#039;&#039;. &lt;/del&gt;I believe we need to be more specific, particularly about languages. Our mission does not say much about where to concentrate our effort. Let me give some ideas about different kinds of languages (the list is not exhaustive).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;I believe we need to be more specific, particularly about languages. Our mission does not say much about where to concentrate our effort. Let me give some ideas about different kinds of languages (the list is not exhaustive).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Languages that do not have (other) machine translation, free/open-source or otherwise, and there is little chance that they will be tackled with the new neural approach, as the parallel corpora available are too small. Nice examples are Occitan, Sardinian or Breton. We do have some Apertium for these languages, but they are different.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Languages that do not have (other) machine translation, free/open-source or otherwise, and there is little chance that they will be tackled with the new neural approach, as the parallel corpora available are too small. Nice examples are Occitan, Sardinian or Breton. We do have some Apertium for these languages, but they are different.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 31:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 32:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;=== Engine ===&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-left&quot; title=&quot;Paragraph was moved. Click to jump to new location.&quot; href=&quot;#movedpara_7_1_rhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_6_0_lhs&quot;&gt;&lt;/a&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;Engine&#039;&#039;&#039;: &lt;/del&gt;I will not be too specific here as I am not that familiar and I may be wrong. For instance we could&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-right&quot; title=&quot;Paragraph was moved. Click to jump to old location.&quot; href=&quot;#movedpara_6_0_lhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_7_1_rhs&quot;&gt;&lt;/a&gt;I will not be too specific here as I am not that familiar and I may be wrong. For instance we could&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# improve the Java engine and, consequently, the Android app (remember it can be used offline, as in the [https://translatorswithoutborders.org/translators-without-borders-develops-worlds-first-crisis-specific-machine-translation-system-kurdish-languages/ Kurdish language developed by Translators without Borders]) and the [http://wiki.apertium.org/wiki/Apertium-OmegaT OmegaT plugin]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# improve the Java engine and, consequently, the Android app (remember it can be used offline, as in the [https://translatorswithoutborders.org/translators-without-borders-develops-worlds-first-crisis-specific-machine-translation-system-kurdish-languages/ Kurdish language developed by Translators without Borders]) and the [http://wiki.apertium.org/wiki/Apertium-OmegaT OmegaT plugin]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 45:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 48:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;### An interesting research idea that people have discussed with me would be to have Apertium be able to generate data for neural MT, but with a more expressive formalism. We have done some work on this already, but not with a very generative system. I have some grant ideas about that, e.g. from field linguistics -&amp;gt; NMT; I can expound on them if people are interested. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;### An interesting research idea that people have discussed with me would be to have Apertium be able to generate data for neural MT, but with a more expressive formalism. We have done some work on this already, but not with a very generative system. I have some grant ideas about that, e.g. from field linguistics -&amp;gt; NMT; I can expound on them if people are interested. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;=== Evaluation ===&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-left&quot; title=&quot;Paragraph was moved. Click to jump to new location.&quot; href=&quot;#movedpara_11_1_rhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_10_0_lhs&quot;&gt;&lt;/a&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;Evaluation&#039;&#039;&#039;: we&lt;/del&gt; have some [http://wiki.apertium.org/wiki/Evaluation automatic evaluation] (word-error rate, naïve coverage) in our wiki but we should select some language pairs for evaluation. Gap-filling (see [https://arxiv.org/abs/1809.00315 this recent paper] and references therein) could be used when we expect a language pair to be used for gisting. For language pairs which may be used for dissemination, we could pay professional post-editors. The Universitat d&#039;Alacant has experience with this (we had four people post-editing for a week, and we spent about €3000). Tools like [https://github.com/ghpaetzold/PET PET] could be used. But there are also some unexplored uses of Apertium such as [https://github.com/transducens/Forecat-OmegaT  interactive machine translation] (see also [https://ufal.mff.cuni.cz/pbml/102/art-torregrosa-forcada-perez-ortiz.pdf this paper]). &lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-right&quot; title=&quot;Paragraph was moved. Click to jump to old location.&quot; href=&quot;#movedpara_10_0_lhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_11_1_rhs&quot;&gt;&lt;/a&gt;&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;We&lt;/ins&gt; have some [http://wiki.apertium.org/wiki/Evaluation automatic evaluation] (word-error rate, naïve coverage) in our wiki but we should select some language pairs for evaluation. Gap-filling (see [https://arxiv.org/abs/1809.00315 this recent paper] and references therein) could be used when we expect a language pair to be used for gisting. For language pairs which may be used for dissemination, we could pay professional post-editors. The Universitat d&#039;Alacant has experience with this (we had four people post-editing for a week, and we spent about €3000). Tools like [https://github.com/ghpaetzold/PET PET] could be used. But there are also some unexplored uses of Apertium such as [https://github.com/transducens/Forecat-OmegaT  interactive machine translation] (see also [https://ufal.mff.cuni.cz/pbml/102/art-torregrosa-forcada-perez-ortiz.pdf this paper]). &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;Things we can spend our money on:&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;Things we can spend our money on:&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 58:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 63:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;### Classification of translation errors by module &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;### Classification of translation errors by module &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;=== GSoC proposals ===&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-left&quot; title=&quot;Paragraph was moved. Click to jump to new location.&quot; href=&quot;#movedpara_15_1_rhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_14_0_lhs&quot;&gt;&lt;/a&gt;&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;GSoC proposals&#039;&#039;&#039;. &lt;/del&gt;Very specific proposals that should be part of GSoC 2019 if we can find interested people (&quot;Mikel&#039;s list&quot;), to be fleshed out.&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-right&quot; title=&quot;Paragraph was moved. Click to jump to old location.&quot; href=&quot;#movedpara_14_0_lhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_15_1_rhs&quot;&gt;&lt;/a&gt;Very specific proposals that should be part of GSoC 2019 if we can find interested people (&quot;Mikel&#039;s list&quot;), to be fleshed out.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Improve the quality of Occitan pairs. Contact Occitanists so that they get involved and spread the word.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Improve the quality of Occitan pairs. Contact Occitanists so that they get involved and spread the word.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Xavivars</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68440&amp;oldid=prev</id>
		<title>Mlforcada: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68440&amp;oldid=prev"/>
		<updated>2019-01-26T10:40:39Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:40, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 69:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 69:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;*&lt;/del&gt;MEETING PROPOSAL&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;*&lt;/del&gt;: If many people are going to be in Dublin this year for the [https://www.mtsummit2019.com/ MT Summit], how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSoC this year, but it&#039;s been nearly ten years since we had a real planning meeting. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;MEETING PROPOSAL&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;&#039;&#039;&#039;&lt;/ins&gt;: If many people are going to be in Dublin this year for the [https://www.mtsummit2019.com/ MT Summit], how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSoC this year, but it&#039;s been nearly ten years since we had a real planning meeting. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea. Let me see what we can do.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea. Let me see what we can do.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68439&amp;oldid=prev</id>
		<title>Mlforcada: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68439&amp;oldid=prev"/>
		<updated>2019-01-26T10:40:05Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:40, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 28:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 28:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Tentatively agree, but we should do a careful study of where the best impact can be found before doing e.g. Swahili--English. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Tentatively agree, but we should do a careful study of where the best impact can be found before doing e.g. Swahili--English. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree, some of these don&#039;t even use the latest code from -lex-tools and -separable, and a substantial amount of work could be done to&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree, some of these don&#039;t even use the latest code from -lex-tools and -separable, and a substantial amount of work could be done to&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; improve them based on these. Out of the three: Portuguese-Spanish = coverage is the main problem as far as I can tell, we have 15k entries, we need 50k or so, e.g. &amp;gt;95% coverage instead of &amp;gt;90%; French--Spanish = needs more rules, and much better disambiguation.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;improve them based on these. Out of the three: Portuguese-Spanish = coverage is the main problem as far as I can tell, we have 15k entries, we need 50k or so, e.g. &amp;gt;95% coverage instead of &amp;gt;90%; French--Spanish = needs more rules, and much better disambiguation.&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68438&amp;oldid=prev</id>
		<title>Mlforcada: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68438&amp;oldid=prev"/>
		<updated>2019-01-26T10:39:42Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:39, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 32:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 32:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&#039;&#039;&#039;Engine&#039;&#039;&#039;: I will not be too specific here as I am not that familiar and I may be wrong. For instance&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;,&lt;/del&gt; &lt;del class=&quot;diffchange diffchange-inline&quot;&gt;one&lt;/del&gt; &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&#039;&#039;&#039;Engine&#039;&#039;&#039;: I will not be too specific here as I am not that familiar and I may be wrong. For instance &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;we&lt;/ins&gt; &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;could&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# improve the Java engine and, consequently, the Android app (remember it can be used offline, as in the [https://translatorswithoutborders.org/translators-without-borders-develops-worlds-first-crisis-specific-machine-translation-system-kurdish-languages/ Kurdish language developed by Translators without Borders]) and the [http://wiki.apertium.org/wiki/Apertium-OmegaT OmegaT plugin]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# improve the Java engine and, consequently, the Android app (remember it can be used offline, as in the [https://translatorswithoutborders.org/translators-without-borders-develops-worlds-first-crisis-specific-machine-translation-system-kurdish-languages/ Kurdish language developed by Translators without Borders]) and the [http://wiki.apertium.org/wiki/Apertium-OmegaT OmegaT plugin]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Tino_Didriksen|Tino Didriksen]]: My very strong opinion about Java is to broadly speaking forget about Java. Maintaining Java ports is quite frankly a waste of time, and such ports would too quickly lag behind the C++ codebases. All the C++ code works cross-platform, also on Android - it just needs scripts to be built and bundled correctly, which is a mostly one-time setup task perfectly suitable for GSoC. What we may need is a JNI layer and better C++ APIs so that Java can call the native backends without spawning separate piped processes - this would also be a good GSoC project. An alternative if one really wants non-native libraries is something like Emscripten that can compile the C++ code to JavaScript, but that&#039;s just yet another build target similar to Android.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# improve the CG3 processor (it is too slow now, I gather, and is heavily used in many languages).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# improve the CG3 processor (it is too slow now, I gather, and is heavily used in many languages).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: About CG3, it is rarely the bottleneck until you get up to thousands of rules. The transfer component is usually a bigger bottleneck, although I would welcome hard facts about that. In any case, I think Apertium is reasonably fast, most of the time.---Improvements are always possible though.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: About CG3, it is rarely the bottleneck until you get up to thousands of rules. The transfer component is usually a bigger bottleneck, although I would welcome hard facts about that. In any case, I think Apertium is reasonably fast, most of the time.---Improvements are always possible though.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Tino_Didriksen|Tino Didriksen]]: it&#039;s been tried and mostly failed. To get truly explosive performance boosts, the corresponding limitations in rule number and rule complexity one has to live with are stifling. But I&#039;m always open for new ideas, and I hope to be proven wrong - while I don&#039;t think CG-3 is too slow, faster is better.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: For me, there are a number of things that should be improved in the engine.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: For me, there are a number of things that should be improved in the engine.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;### The first is that each component should have the potential to be weighted, and these weights should be able to be learnt jointly. This is a big project, probably at least a PhD amount of work, but has the potential to offer very big improvements.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;### The first is that each component should have the potential to be weighted, and these weights should be able to be learnt jointly. This is a big project, probably at least a PhD amount of work, but has the potential to offer very big improvements.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68437&amp;oldid=prev</id>
		<title>Mlforcada: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68437&amp;oldid=prev"/>
		<updated>2019-01-26T10:36:56Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:36, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 35:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 35:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# improve the Java engine and, consequently, the Android app (remember it can be used offline, as in the [https://translatorswithoutborders.org/translators-without-borders-develops-worlds-first-crisis-specific-machine-translation-system-kurdish-languages/ Kurdish language developed by Translators without Borders]) and the [http://wiki.apertium.org/wiki/Apertium-OmegaT OmegaT plugin]&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# improve the Java engine and, consequently, the Android app (remember it can be used offline, as in the [https://translatorswithoutborders.org/translators-without-borders-develops-worlds-first-crisis-specific-machine-translation-system-kurdish-languages/ Kurdish language developed by Translators without Borders]) and the [http://wiki.apertium.org/wiki/Apertium-OmegaT OmegaT plugin]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Here I disagree. I think that having a Java port is a dead end. It is&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;inevitable that it will get out of date very quickly. I think that a better strategy is to (sentence incomplete in Fran&#039;s message).&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# improve the CG3 processor (it is too slow now, I gather, and is heavily used in many languages).&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# improve the CG3 processor (it is too slow now, I gather, and is heavily used in many languages).&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: About CG3, it is rarely the bottleneck until you get up to thousands of rules. The transfer component is usually a bigger bottleneck, although I would welcome hard facts about that. In any case, I think Apertium is reasonably fast, most of the time.---Improvements are always possible though.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: About CG3, it is rarely the bottleneck until you get up to thousands of rules. The transfer component is usually a bigger bottleneck, although I would welcome hard facts about that. In any case, I think Apertium is reasonably fast, most of the time.---Improvements are always possible though.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68436&amp;oldid=prev</id>
		<title>Mlforcada: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68436&amp;oldid=prev"/>
		<updated>2019-01-26T10:36:34Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:36, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 26:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 26:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;### [[User:Francis Tyers|ftyers]]: The gisting case is also useful yes.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;### [[User:Francis Tyers|ftyers]]: The gisting case is also useful yes.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Languages that do have commercial machine translation available but are important enough to be tackled by Apertium. I am thinking about languages such as Swahili, Hausa, Gujarati, Igbo, or Yoruba (the Universitat d&#039;Alacant is currently involved in a project called GoURMeT that will build MT for some of these languages. Neural will be hard, as corpora are small. There could be some interesting synergies). These languages are rather large and served by Google and other MT providers but do not have free/open-source MT. Apertium, as a mission, has to &quot;give everyone free, unlimited access to the best possible machine-translation technologies&quot;. When the only MT systems available for a language are neural or statistical black boxes trained on corpora which are not publicly available, then their language communities are disempowered and vendor-locked. We need to generate free/open-source technology for them, perhaps for gisting purposes.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Languages that do have commercial machine translation available but are important enough to be tackled by Apertium. I am thinking about languages such as Swahili, Hausa, Gujarati, Igbo, or Yoruba (the Universitat d&#039;Alacant is currently involved in a project called GoURMeT that will build MT for some of these languages. Neural will be hard, as corpora are small. There could be some interesting synergies). These languages are rather large and served by Google and other MT providers but do not have free/open-source MT. Apertium, as a mission, has to &quot;give everyone free, unlimited access to the best possible machine-translation technologies&quot;. When the only MT systems available for a language are neural or statistical black boxes trained on corpora which are not publicly available, then their language communities are disempowered and vendor-locked. We need to generate free/open-source technology for them, perhaps for gisting purposes.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Tentatively agree, but we should do a careful study of where the best impact can&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Tentatively agree, but we should do a careful study of where the best impact can&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; be found before doing e.g. Swahili--English. &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;be found before doing e.g. Swahili--English. &lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree, some of these don&#039;t even use the latest code from -lex-tools and -separable, and a substantial amount of work could be done to&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree, some of these don&#039;t even use the latest code from -lex-tools and -separable, and a substantial amount of work could be done to&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68435&amp;oldid=prev</id>
		<title>Mlforcada: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68435&amp;oldid=prev"/>
		<updated>2019-01-26T10:36:12Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:36, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 29:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 29:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;be found before doing e.g. Swahili--English. &lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;be found before doing e.g. Swahili--English. &lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;# Languages that may be well supported by more than one MT vendor and have good or very good Apertium engines: Catalan–Spanish, Portuguese–Spanish, French–Spanish, etc. Their impact is large because they are widely used. Improving their quality may increase their impact and give Apertium some prestige.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree, some of these don&#039;t even use the latest code from -lex-tools&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;## [[User:Francis Tyers|ftyers]]: Agree, some of these don&#039;t even use the latest code from -lex-tools&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; and -separable, and a substantial amount of work could be done to&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;and -separable, and a substantial amount of work could be done to&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;improve them based on these. Out of the three: Portuguese-Spanish = coverage is the main problem as far as I can tell, we have 15k entries, we need 50k or so, e.g. &amp;gt;95% coverage instead of &amp;gt;90%; French--Spanish = needs more rules, and much better disambiguation.&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;improve them based on these. Out of the three: Portuguese-Spanish = coverage is the main problem as far as I can tell, we have 15k entries, we need 50k or so, e.g. &amp;gt;95% coverage instead of &amp;gt;90%; French--Spanish = needs more rules, and much better disambiguation.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68434&amp;oldid=prev</id>
		<title>Mlforcada: /* Discussion: Apertium strategy and the GSoC */</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Talk:Google_Summer_of_Code/Application_2019&amp;diff=68434&amp;oldid=prev"/>
		<updated>2019-01-26T10:35:46Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Discussion: Apertium strategy and the GSoC&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 10:35, 26 January 2019&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 72:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 72:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: *MEETING PROPOSAL*: If many people are going to be in Dublin this year for MT&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;# [[User:Francis Tyers|ftyers]]: *MEETING PROPOSAL*: If many people are going to be in Dublin this year for&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; the [https://www.mtsummit2019.com/&lt;/ins&gt; MT&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt; Summit], how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSOC this year, but it&#039;s definitely nearly ten years since we had a real planning meeting. &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-right&quot; title=&quot;Paragraph was moved. Click to jump to old location.&quot; href=&quot;#movedpara_4_0_lhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_2_0_rhs&quot;&gt;&lt;/a&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;. Let me see what we can do&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;Summit, how about having an Apertium meeting/workshop where we flesh this stuff out and try and gain some direction? It won&#039;t be in time for GSOC this year, but it&#039;s definitely nearly ten years since we had a real planning meeting. &lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;a class=&quot;mw-diff-movedpara-left&quot; title=&quot;Paragraph was moved. Click to jump to new location.&quot; href=&quot;#movedpara_2_0_rhs&quot;&gt;&amp;#x26AB;&lt;/a&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;div&gt;&lt;a name=&quot;movedpara_4_0_lhs&quot;&gt;&lt;/a&gt;## --[[User:Mlforcada|Mlforcada]] ([[User talk:Mlforcada|talk]]) 11:32, 26 January 2019 (CET) This is really a cool idea.&lt;/div&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mlforcada</name></author>
		
	</entry>
</feed>