English and Spanish

Latest revision as of 13:34, 10 December 2010

Lexis

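$ echo "Don't annoy your sister" | apertium en-es
No molesta vuestro hermano

The output is wrong on three counts: a negative imperative in Spanish takes the subjunctive, "sister" should be "hermana" rather than "hermano", and the direct object needs the personal "a". A correct translation would be, for example, "No molestes a tu hermana".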


Transfer rules

  • When restoring a dropped third person singular subject in English, use the third person plural (i.e. gender-neutral singular they) with motion verbs, and the third person neuter ("it") with non-motion verbs, e.g.
    "I've got a friend coming over for dinner"
    "Oh, what time are they arriving?"
  • Some way needs to be figured out to remove double spaces (e.g. when ^do<vbdo><pres>$ is deleted in the dictionary, it leaves two spaces).
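
One possible workaround for the double-space problem is to clean up the text after translation. The following is only a sketch of that idea in Python, not something Apertium provides: the function name collapse_spaces and the sample sentence are made up for illustration.

```python
import re


def collapse_spaces(text: str) -> str:
    """Replace every run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", text)


# Hypothetical output in which a deleted word left two spaces behind.
print(collapse_spaces("what  you want"))
```

A fix inside the pipeline itself would be preferable, since a post-processing step like this would also flatten double spaces that were present in the original input.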

Spanish tagger

In the sentence 'La GPL logra que nadie pueda mejorar el software', logra is analysed as:

^logra/lograr<vblex><pri><p3><sg>/lograr<vblex><imp><p2><sg>$

but the tagger wrongly selects the second person imperative reading:

^lograr<vblex><imp><p2><sg>$

instead of the third person present indicative.
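
To make the ambiguity easier to inspect, the analysis above can be split into its surface form and candidate readings. This is a minimal Python sketch under stated assumptions: parse_lexical_unit is a hypothetical helper, and it ignores escaped characters and any other special cases of the stream format.

```python
def parse_lexical_unit(unit: str):
    """Split '^surface/analysis1/analysis2$' into the surface form and
    a list of (lemma, tags) pairs, one per candidate analysis."""
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("expected a unit of the form ^...$")
    surface, *analyses = unit[1:-1].split("/")
    parsed = []
    for analysis in analyses:
        # "lograr<vblex><pri>" -> lemma "lograr", tags ["vblex", "pri"]
        lemma, _, rest = analysis.partition("<")
        tags = rest.rstrip(">").split("><") if rest else []
        parsed.append((lemma, tags))
    return surface, parsed


surface, readings = parse_lexical_unit(
    "^logra/lograr<vblex><pri><p3><sg>/lograr<vblex><imp><p2><sg>$"
)
for lemma, tags in readings:
    print(surface, lemma, ".".join(tags))
```

The two readings differ only in mood and person (pri.p3 against imp.p2), which is the distinction the tagger has to get right here.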

This only covers the specific sentence I'm looking at, but the following additions should solve the problem (though they don't cover any adjectives or adverbs that may occur between the noun and the verb):

 <!-- Verbs in the first or second person -->
 <def-label name="VLEXP12">
   <tags-item tags="vblex.*.p1.*"/>
   <tags-item tags="vblex.*.p2.*"/>
   <tags-item tags="vbser.*.p1.*"/>
   <tags-item tags="vbser.*.p2.*"/>
   <tags-item tags="vbhaver.*.p1.*"/>
   <tags-item tags="vbhaver.*.p2.*"/>
   <tags-item tags="vbmod.*.p1.*"/>
   <tags-item tags="vbmod.*.p2.*"/>
 </def-label>
 <!-- Any common or proper noun -->
 <def-label name="NOMANY">
   <tags-item tags="n.*"/>
   <tags-item tags="np.*"/>
 </def-label>
 <!-- Presumably to be placed in the <forbid> section: a noun may not be followed by a first or second person verb -->
 <label-sequence>
   <label-item label="NOMANY"/>
   <label-item label="VLEXP12"/>
 </label-sequence>
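
The intended effect of these additions can be illustrated with a small Python sketch. It is an approximation only: it assumes the label sequence acts as a forbidden pair of adjacent tags, it uses fnmatch-style wildcards as a stand-in for the tagger's own pattern matching, and the noun analysis n.f.sg is a made-up example rather than the actual analysis of "GPL".

```python
from fnmatch import fnmatchcase

# Tag patterns copied from the two labels proposed above.
LABELS = {
    "VLEXP12": [
        "vblex.*.p1.*", "vblex.*.p2.*",
        "vbser.*.p1.*", "vbser.*.p2.*",
        "vbhaver.*.p1.*", "vbhaver.*.p2.*",
        "vbmod.*.p1.*", "vbmod.*.p2.*",
    ],
    "NOMANY": ["n.*", "np.*"],
}

# Assumed meaning of the label sequence: NOMANY may not precede VLEXP12.
FORBIDDEN = [("NOMANY", "VLEXP12")]


def has_label(tags, label):
    """Return True if the dotted tag string matches any pattern of the label."""
    dotted = ".".join(tags)
    return any(fnmatchcase(dotted, pattern) for pattern in LABELS[label])


def is_forbidden(prev_tags, next_tags):
    """Return True if the two adjacent analyses form a forbidden sequence."""
    return any(
        has_label(prev_tags, first) and has_label(next_tags, second)
        for first, second in FORBIDDEN
    )


noun = ["n", "f", "sg"]  # hypothetical analysis of the preceding noun
print(is_forbidden(noun, ["vblex", "imp", "p2", "sg"]))  # imperative reading
print(is_forbidden(noun, ["vblex", "pri", "p3", "sg"]))  # indicative reading
```

Under these assumptions the imperative reading of "logra" is ruled out after a noun, which leaves the present indicative reading as the only candidate.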

Cleanup

  • Remove apertium level-1 files (apertium-en-es.trules-es-en.xml etc.)