English and Esperanto/Rejected tests

From Apertium

Here we put problems that we do not want to solve right now, perhaps because they are unsolvable (without breaking something else), perhaps because we want to postpone the effort of solving them, or perhaps because we simply do not consider them important enough to solve now.

Errors caused by incorrect analysis (so-called tagging errors) - not handled for the time being - rejected for now

In Apertium, apertium-tagger chooses among the various possible analyses of each word.

Here is an example:

He has the rights to go. → Li havas la rajtojn iri. That is fine, but the following sentence wrongly gives 'Li havas la ĝusta iri.':

  • (en) He has the right to go. → Li havas la rajton iri.

This is because the analysis of the sentence is as follows:

^He/Prpers<prn><subj><p3><m><sg>$ ^has/have<vbhaver><pres><p3><sg>/have<vblex><pres><p3><sg>$ 
^the/the<det><def><sp>/the<det><def><sg>/the<det><def><pl>$ 
^right/right<adj>/right<adv>/right<n><sg>$  ^to/to<pr>$ ^go/go<vblex><inf>/go<vblex><pres>$^./.<sent>$

from which apertium-tagger wrongly picks the reading right<adj> instead of the correct right<n><sg>:

^Prpers<prn><subj><p3><m><sg>$ ^have<vblex><pres><p3><sg>$ ^the<det><def><sp>$ ''^right<adj>$'' ^to<pr>$ ^go<vblex><inf>$^.<sent>$
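To see where the tagger has a choice to make, it can help to list the lexical units that carry more than one analysis. The following is a minimal sketch, not part of Apertium: the function name `ambiguous_units` and the regular expression are illustrative assumptions, and the parsing ignores escaped characters and other details of the real stream format.

```python
import re

# One lexical unit in the analyser output: ^surface/analysis1/analysis2...$
# Simplification (assumption): no escaped '/', '^' or '$' inside a unit.
LEXICAL_UNIT = re.compile(r"\^([^/$^]+)((?:/[^/$^]+)+)\$")


def ambiguous_units(analysed):
    """Return (surface, analyses) pairs for units with more than one analysis."""
    result = []
    for match in LEXICAL_UNIT.finditer(analysed):
        surface = match.group(1)
        # group(2) looks like "/a1/a2/..."; drop the leading slash and split
        analyses = match.group(2)[1:].split("/")
        if len(analyses) > 1:
            result.append((surface, analyses))
    return result


# The analysis of "He has the right to go." quoted above
ANALYSED = (
    "^He/Prpers<prn><subj><p3><m><sg>$ "
    "^has/have<vbhaver><pres><p3><sg>/have<vblex><pres><p3><sg>$ "
    "^the/the<det><def><sp>/the<det><def><sg>/the<det><def><pl>$ "
    "^right/right<adj>/right<adv>/right<n><sg>$ "
    "^to/to<pr>$ ^go/go<vblex><inf>/go<vblex><pres>$^./.<sent>$"
)

for surface, analyses in ambiguous_units(ANALYSED):
    print(surface, "->", ", ".join(analyses))
```

For this sentence the ambiguous units are "has", "the", "right" and "go"; the wrong translation comes only from the choice made for "right".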


  • (en) Being able to identify individual children in GCompris means that we can provide individual reports. → Estanta kapabla identigi individuajn infanojn en GCompris signifas ke ni povas provizi individuajn raportojn.

The tagger chooses mean<n> (-> mezaĵo), but it should choose mean<vblex>.


  • (en) on the left side. → Je la maldekstra flanko.


but<adv> is chosen far too often

Therefore it is commented out for now, but it must be uncommented again before the tagger is retrained.

but<adv> (-> krom<adv>) should only occur in sentences like

  • All but the children worked a lot.
  • All except the children worked a lot.
  • The battle was all but lost
  • the timesharing industry had all but disappeared
  • an all but impregnable stronghold

and but<cnjcoo> (-> sed<cnjcoo>) should only occur in sentences like

  • but soon valid genealogical sources all but dry up
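The examples above suggest a rough pattern: the adverbial reading follows "all", and any other "but" is the conjunction. The sketch below encodes only that pattern so it can be checked against the listed sentences. It is a hypothetical heuristic (the function `guess_but_tags` is not part of Apertium), and it is not how apertium-tagger decides, since the tagger works on a statistical basis.

```python
def guess_but_tags(sentence):
    """Guess a tag for every 'but' in the sentence.

    Assumption for illustration only: 'but' directly after 'all' is
    but<adv>, every other 'but' is but<cnjcoo>.
    """
    # Crude tokenisation: lower-case, drop full stops and commas, split on spaces
    tokens = sentence.lower().replace(".", " ").replace(",", " ").split()
    tags = []
    for i, token in enumerate(tokens):
        if token == "but":
            previous = tokens[i - 1] if i > 0 else ""
            tags.append("but<adv>" if previous == "all" else "but<cnjcoo>")
    return tags


print(guess_but_tags("The battle was all but lost"))
print(guess_but_tags("but soon valid genealogical sources all but dry up"))
```

The last example sentence contains both readings: the sentence-initial "but" is the conjunction and the "but" in "all but dry up" is the adverb.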

Misc

(10:19:15) jacob: en-eo	  You can save multiple configurations, and switch between them easily. 
	- Vi povas sekurigi multajn agordojn, kaj ŝalti inter ili facile. 
	+ Vi povas savi *multiple agordoj, kaj ŝanĝo inter ilin facile.
(10:20:00) jacob: Why is "switch" in "switch between them" considered a noun?
(10:20:16) jacob: (ŝanĝo)
(10:20:19) francis: did you put it in the testing interface ?
(10:20:42) francis: the sentence
(10:20:47) francis: ^and/and<cnjcoo>$ ^switch/switch<n><sg>/switch<vblex><inf>/switch<vblex><pres>$ ^between/between<pr>$
(10:20:48) francis:  
(10:20:54) francis: ^and<cnjcoo>$ ^switch<n><sg>$ ^between<pr>$
(10:20:54) francis:  
(10:21:04) francis: the options for "switch" are noun, verb 
(10:21:07) francis: it chooses noun
(10:21:13) francis: the tagger works on a statistical basis
(10:22:55) jacob: But "easily" can only be there if "switch" is a verb.
(10:23:34) jacob: "There is a switch between them"  (noun)
(10:23:51) jacob: "switch between them" (noun or verb)
(10:24:11) jacob: "switch between them easily" (can only be verb)

"switch between them easily" (can only be verb) - not true. 'You can put a switch between them easily' -- Jimregan 18:34, 15 September 2008 (UTC)

  • (en) I can repair it so that it works → Mi povas ripari ĝin tiel ke ĝi laboras

^it/prpers<prn><subj><p3><nt><sg>/prpers<prn><obj><p3><nt><sg>$ The tagger chooses the <obj> reading for the second "it", so the result is 'Mi povas ripari ĝin tiel ke ĝin laboras'.

  • (en) You can add not only users but also classes. → Vi povas aldoni ne nur uzantojn sed ankaŭ klasojn.

The tagger selects the adjective reading of 'only'. It should select the adverb reading.

  • (en) After reports earlier this year → Post raportoj pli frua ĉi tiu jaro

should choose ^After/After<pr>$


referring to earlier parts of a long phrase (tagging error on "switch"):

  • (en) You can save multiple configurations, and switch between them easily. → Vi povas konservi multoblajn agordojn, kaj facile salti inter ili.


"Note that" + sentence => "Notu, ke" + sentence

  • (en) Note that you can import users from a comma-separated file. → Notu, ke vi povas importi uzantojn de komo-apartigita dosiero.

Tagger: Infinitive or present coming instead of imperative

  • (en) Just untoggle them in the treeview. → Simple malelektu ilin en la arba vido
  • (en) In the 'Profile' section add a profile, then in the 'Board' section select the profile in the combobox, then select the boards you want to be active. → En la 'Profilo' sekcio aldonu profilon, poste en la 'Tabulo' sekcio elektu la profilon en la falmenuo, poste elektu la tabulojn kiujn vi volas aktivigi.
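I have added the following entry to the paradigm:
    <pardef n="accept__vblex">
      <!-- other entries of the paradigm are not shown here -->

      <!-- added entry: the bare form (empty suffix) analysed as an imperative -->
      <e>
        <p>
          <l/>
          <r><s n="vblex"/><s n="imp"/></r>
        </p>
      </e>
    </pardef>

but the apertium-tagger prefers the infinitive or the present.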

	<jimregan2>	there is no imperative in English
	<spectie>	there kind of is
	<jimregan2>	not really
	<jacobn>	Look at the manual. Read the manual.
	<jimregan2>	there are hacky ways around it, but only in certain conditions
	<jacobn>	Thats imperatives
	<spectie>	well, it is formed with the bare infinitive
	<spectie>	http://en.wikipedia.org/wiki/Imperative_mood
	<jacobn>	is it possible to implement the "hacky ways" or copy them from another language?
	<spectie>	they don't work from english → spanish
	<spectie>	:/
	<jimregan2>	dunno yet
	<jimregan2>	no, copying from another language isn't an option, really
	<jimregan2>	I'll try to add them in en->fr
	<jimregan2>	that reuses the rules from en-es, so if I get a solution there, it should map cleanly enough to en-es and en-ca
	<jacobn>	Jim: So I will put imperative on "future tests" and wait (how long?) for you to get some experience on imperatives
	<jimregan2>	dunno


Numerals

  • (en) fourteen million, three hundred and eighty-five thousand, four hundred and seventy-six → dek kvar milionoj tri cent okdek kvin mil kvar cent sepdek ses
  • (en) fourteen million three hundred and eighty five thousand, four hundred and seventy six → dek kvar milionoj tri cent okdek kvin mil kvar cent sepdek ses
  • (en) fourteen million three hundred eighty five thousand four hundred seventy six → dek kvar milionoj tri cent okdek kvin mil kvar cent sepdek ses
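All three English spellings should yield the same Esperanto number, so the target string depends only on the numeric value. As a rough illustration of how that target is composed, the sketch below builds the Esperanto words from an integer. It is not Apertium code: `eo_number` and `below_thousand` are hypothetical helpers, and the spacing (tens joined as in "okdek", hundreds written apart as in "tri cent") is an assumption copied from the expected outputs above.

```python
UNITS = ["", "unu", "du", "tri", "kvar", "kvin", "ses", "sep", "ok", "naŭ"]


def below_thousand(n):
    """Return the list of words for 0 <= n <= 999 (empty list for 0)."""
    words = []
    hundreds, rest = divmod(n, 100)
    tens, units = divmod(rest, 10)
    if hundreds:
        if hundreds > 1:
            words.append(UNITS[hundreds])  # "tri cent" as two words, as above
        words.append("cent")
    if tens:
        # tens are joined: "okdek", "sepdek"; plain ten is just "dek"
        words.append((UNITS[tens] if tens > 1 else "") + "dek")
    if units:
        words.append(UNITS[units])
    return words


def eo_number(n):
    """Spell out 0 <= n < 10**9 in Esperanto words."""
    if not 0 <= n < 10**9:
        raise ValueError("only 0 .. 999999999 are handled in this sketch")
    if n == 0:
        return "nul"
    millions, rest = divmod(n, 10**6)
    thousands, below = divmod(rest, 1000)
    words = []
    if millions:
        words += below_thousand(millions)
        words.append("miliono" if millions == 1 else "milionoj")
    if thousands:
        if thousands > 1:  # 1000 is plain "mil", not "unu mil"
            words += below_thousand(thousands)
        words.append("mil")
    words += below_thousand(below)
    return " ".join(words)


print(eo_number(14385476))
```

For 14385476 this produces the same string as the expected translations in the list above.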


Dictionary 'errors'

  • (en) You can save multiple configurations → Vi povas konservi multoblajn agordojn

Here 'save' -> savi and not 'konservi'. -- Not agreed. In Esperanto 'savi' == 'to save, to rescue', so 'konservi' == 'to save, to keep, to preserve' fits better.

Misc

  • (en) the unusually big cats → la nekutime grandaj katoj
  • (en) You set the default profile in the 'Profile' section, by choosing the profile you want, then clicking on the 'Default' button. → Vi fiksas la defaŭltan profilon en la 'Profilo' sekcio, elektante la profilon kiun vi volas, kaj poste alklakante la 'Defaŭlto' butonon.