English and Esperanto

From Apertium
Revision as of 18:34, 15 September 2008 by Jimregan (talk | contribs) (→‎Tagging errors: "switch between them easily" (can only be verb) - not true.)

Esperantists, please see Peto al esperantistoj (Request to Esperantists).


Intros to Esperanto

Perhaps http://en.wikiversity.org/wiki/Rules_of_Esperanto_grammar (or http://donh.best.vwh.net/Esperanto/rules.html) is a good overview.

For the affixes, see http://esperanto.davidgsimpson.com/eo-affixes.html (short) or http://steve-and-pattie.com/esperantujo/grparafx.html (longer).

Tenses are explained in http://en.wikipedia.org/wiki/Esperanto_grammar#Verbs

Wordlists

http://freepages.rootsweb.ancestry.com/~wakefield/translations/engesp.html

http://www.mirrorservice.org/sites/download.sourceforge.net/pub/sourceforge/d/dm/dmdictionary/EngEsp.txt


Test set

<jacobn> Jim, Fran: I just looked at http://www.link.cs.cmu.edu/link/batch.html and to me it looks like that "carefully selected text" I was talking about a week ago which would be needed to define the most important features to get covered.
<jimregan2> Jacob, it's ok
<jimregan2> we already have carefully selected text for English :)
<jimregan2> ALL + plural
<jimregan2> ALL + adj + plural
<jacobn> what do you think, is http://www.link.cs.cmu.edu/link/batch.html + their translations to Esperanto suitable as test set?
<jimregan2> as one test set, yes
<jimregan2> I have another few, and I promise I'll get to them on Wednesday
<jacobn> Jim, I would like to include a " carefully selected text for English" in the en-eo test set. Do you have a better suggestion than http://www.link.cs.cmu.edu/link/batch.html ?
<jacobn> Fine
<jimregan2> heck - I'll even set a reminder :)
<jacobn> no haste necessary
<jimregan2> newspaper - type text is best
<jimregan2> I'll grab a few chunks from different books at project gutenberg

<jimregan2> oh, you know about the '*' in the sentences, right?
<jacobn> the * ??
<jacobn> no, never met it
<jacobn> ;-)
<jimregan2> at the start of a lot of the sentences, there's a '*'
<jacobn> Oh that
<jacobn> yes, I've read it
<jimregan2> that's a standard convention in linguistics to say 'this sentence is incorrect'
<jacobn> I would start with the non-* sentences
<jimregan2> just dump anything with '*'
<jimregan2> they're not worth any effort
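
The filtering step agreed on above (drop every sentence marked with a leading '*') could be sketched as follows. This is only a minimal illustration: the function name `keep_grammatical` and the sample sentences are hypothetical and are not taken from the batch file, and the sketch assumes one sentence per line.

```python
def keep_grammatical(lines):
    """Return the sentences that are not marked ungrammatical with a leading '*'."""
    kept = []
    for line in lines:
        sentence = line.strip()
        # Skip blank lines and anything flagged with '*' (linguistic
        # convention for "this sentence is incorrect").
        if not sentence or sentence.startswith("*"):
            continue
        kept.append(sentence)
    return kept


# Hypothetical sample input, one sentence per line.
sample = [
    "The cat chased the dog.",
    "*The cat chased dog the.",
    "",
    "  *Cat the dog chased.",
    "She reads books.",
]

for sentence in keep_grammatical(sample):
    print(sentence)
```

The sentences that survive would then be the ones to translate into Esperanto for the en-eo test set.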

Tagging errors

(10:19:15) jacob: en-eo	  You can save multiple configurations, and switch between them easily. 
	- Vi povas sekurigi multajn agordojn, kaj ŝalti inter ili facile. 
	+ Vi povas savi *multiple agordoj, kaj ŝanĝo inter ilin facile.
(10:20:00) jacob: Why is "switch" in "switch between them" considered a noun?
(10:20:16) jacob: (ŝanĝo)
(10:20:19) francis: did you put it in the testing interface ?
(10:20:42) francis: the sentence
(10:20:47) francis: ^and/and<cnjcoo>$ ^switch/switch<n><sg>/switch<vblex><inf>/switch<vblex><pres>$ ^between/between<pr>$
(10:20:48) francis:  
(10:20:54) francis: ^and<cnjcoo>$ ^switch<n><sg>$ ^between<pr>$
(10:20:54) francis:  
(10:21:04) francis: the options for "switch" are noun, verb 
(10:21:07) francis: it chooses noun
(10:21:13) francis: the tagger works on a statistical basis
(10:22:55) jacob: But "easily" can only be there if "switch" is a verb.
(10:23:34) jacob: "There is a switch between them"  (noun)
(10:23:51) jacob: "switch between them" (noun or verb)
(10:24:11) jacob: "switch between them easily" (can only be verb)
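
To make the analyser output quoted above easier to read, here is a small sketch that splits such a line into surface forms and their candidate analyses and lists the ambiguous words. It is a simplified reading of the `^surface/analysis/...$` notation as it appears in the log: it assumes no escaped characters and no formatting blocks, and the helper names `parse_units` and `ambiguous` are made up for this illustration.

```python
import re

# One lexical unit is everything between '^' and '$'.
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def parse_units(stream):
    """Split analyser output into (surface form, list of analyses) pairs."""
    units = []
    for body in LEXICAL_UNIT.findall(stream):
        # The first field is the surface form; the rest are analyses.
        surface, *analyses = body.split("/")
        units.append((surface, analyses))
    return units


def ambiguous(units):
    """Keep only the units that have more than one candidate analysis."""
    return [(surface, analyses) for surface, analyses in units if len(analyses) > 1]


# The analysed fragment quoted in the log above.
analysed = (
    "^and/and<cnjcoo>$ "
    "^switch/switch<n><sg>/switch<vblex><inf>/switch<vblex><pres>$ "
    "^between/between<pr>$"
)

for surface, analyses in ambiguous(parse_units(analysed)):
    print(surface, analyses)
```

On this fragment only "switch" is reported as ambiguous, with one noun reading and two verb readings; the tagger then has to pick one of them.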

"switch between them easily" (can only be verb) - not true. 'You can put a switch between them easily' -- Jimregan 18:34, 15 September 2008 (UTC)

Jacob TODO

<jacobn> Ok, I'll try the web doc translator more, find the systematics, report a bug and attach files etc.

See also