Difference between revisions of "User:Khannatanmai/Wordbound blanks"

From Apertium

Revision as of 08:13, 24 June 2020

This page tracks the development of wordbound blanks in the Apertium stream format.

Features

Rationale

Formalism

Examples

Markup Handling

$ echo 'legal <b>persons</b>' | apertium en-es -f html
Personas <b>legales</b>

$ echo 'I <b>am</b> David' | apertium en-es -f html
Soy</b> David

Spanish: <p>Es <s>además</s> de Valencia.</p>
Catalan: <p>És <s>a més</s> de València.</p>

English: <p>The <b>big <i>red</i></b> dog</p>
Spanish: <p>El perro <b><i>rojo</i> grande</b></p>

<p>Bees <b>cannot</b> swim</p>
<p>Las Abejas <b>no pueden</b> nadar</p>

<a href="Conway">Conway</a> stated that young <a href="children">children</a>
<i>“understand <a href="Object_permanence">object permanence</a>.
<a href="Concealment">Concealed</a> <a href="Object">objects</a> feature in
their awareness.”</i><span typeof="mw:Extension/ref"><a href="#ref-5">[5]</a></span>
<b>(<a href="Nielsen">Nielsen</a> equivalence).</b>

<p><b><i>my sister</i><br/>lives</b> <u>in Wales</u></p>

<a id="foobar" href="http://example.com">Foo <b>bar</b>.</a>

Ideal Output:
<a id="foobar" href="http://example.com"><b>Бар</b> фоо.</a>
<b>The</b> <i>sister</i>'s <em>dog</em>

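From the Wikimedia tests (https://phabricator.wikimedia.org/diffusion/GCXS/browse/master/test/mt/Apertium.test.js):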

source: '<p>A <b>Japanese</b> <i>BBC</i> article</p>',
target: '<p>Un artículo de <i>BBC</i> <b>japonés</b></p>',

source: '<div>A <b>modern</b> Britain.</div>',
target: '<div>Una Gran Bretaña <b>moderna</b>.</div>',

source: '<p>The <b>big <i>red</i></b> dog</p>',
target: '<p>El perro <b><i>rojo</i></b> <b>grande</b></p>',

source: '<p>He said "<i>I tile <a href="x">bathrooms</a>.</i>"</p>',
target: '<p>Diga que "<i>enladrillo</i> <i><a href="x">baños</a></i>."</p>',

source: '<p>The <b>big red</b> dog</p>',
target: '<p>El perro <b>rojo grande</b></p>',

source: '<p>The <b>big</b> <b>red</b> dog</p>',
target: '<p>El perro <b>rojo</b> <b>grande</b></p>',

source: '<p>The <a href="1">big</a> <a href="2">red</a> dog</p>',
target: '<p>El perro <a href="2">rojo</a> <a href="1">grande</a></p>',
		
source: '<p id="8"><span class="cx-segment" data-segmentid="9"><a class="cx-link" data-linkid="17" href="./The_New_York_Times" rel="mw:WikiLink" title="The New York Times">The New York Times</a>, which has an <b>executive editor</b> over the news pages and an <b>editorial page editor</b> over opinion pages.</span></p>',
target: '<p id="8"><span data-segmentid="9" class="cx-segment"><a title="The New York Times" rel="mw:WikiLink" href="./The_New_York_Times" data-linkid="17" class="cx-link">The New York Times</a>, el cual tiene un <b>editor ejecutivo</b> sobre las páginas noticiosas y un <b>editor de página del editorial</b> encima páginas de opinión.</span></p>',

source: '<p id="8"><style>b{color:red;}</style></p>',
target: '<p id="8"><style>b{color:red;}</style></p>',	
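
One way to read the source/target pairs above is that every word carries the stack of tags that encloses it, and that stack travels with the word when the translation reorders it. The following is a minimal Python sketch of that reading, using only the standard library. It is not Apertium's implementation and does not use the wordbound blank syntax; the names `WordTagCollector` and `words_with_tags` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class WordTagCollector(HTMLParser):
    """Record, for every word of text, the tags that enclose it."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open tags, outermost first
        self.words = []  # (word, tuple of enclosing tags)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost open tag with this name, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        for word in data.split():
            self.words.append((word, tuple(self.stack)))


def words_with_tags(html):
    """Return a list of (word, enclosing tags) pairs in document order."""
    collector = WordTagCollector()
    collector.feed(html)
    collector.close()
    return collector.words


if __name__ == "__main__":
    source = '<p>The <b>big <i>red</i></b> dog</p>'
    target = '<p>El perro <b><i>rojo</i></b> <b>grande</b></p>'
    for word, tags in words_with_tags(source):
        print(word, tags)
    for word, tags in words_with_tags(target):
        print(word, tags)
```

For this pair, "red" and "rojo" both sit inside `p`, `b` and `i`, while "big" and "grande" sit inside `p` and `b`, even though the target needs two separate `<b>` elements to express it. Pairs where the translation adds or drops words (for example the added "de" in the first pair) would need a word alignment before the two sides could be compared this way.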

Tests

Previous Attempts

References