Difference between revisions of "Bytecode for transfer"

Revision as of 00:08, 28 February 2010

A concrete example: Esperanto-English

So the transfer rule file http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/testdata/transfer/apertium-eo-en.eo-en.t1x?view=markup becomes the Java source file http://apertium.svn.sourceforge.net/viewvc/apertium/trunk/lttoolbox-java/src/org/apertium/transfer/generated/apertium_eo_en_eo_en_t1x.java?view=markup

which is compiled into Java bytecode and executed by the Java virtual machine with its just-in-time (JIT) compiler.
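To illustrate why generating Java from the rules can pay off, here is a minimal sketch contrasting an interpreted test with its compiled equivalent. It is not the code that lttoolbox-java generates: the `Word` and `Node` classes, the `"pos:part"` encoding and the lemma `la` are all hypothetical simplifications made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class InterpretedVsCompiled {

    /** Hypothetical, simplified word: a map from attribute name to value. */
    public static final class Word {
        final Map<String, String> attrs;

        public Word(Map<String, String> attrs) {
            this.attrs = attrs;
        }

        /** Returns the value of one attribute, or "" if it is absent. */
        String clip(String part) {
            String value = attrs.get(part);
            return value == null ? "" : value;
        }
    }

    /** Interpreted form: a generic rule node, dispatched on its tag at run time. */
    public static final class Node {
        final String tag;
        final String value;
        final List<Node> children;

        public Node(String tag, String value, Node... children) {
            this.tag = tag;
            this.value = value;
            this.children = Arrays.asList(children);
        }
    }

    /** Evaluates a value node by inspecting its tag on every call. */
    static String evalValue(Node n, Word[] words) {
        switch (n.tag) {
            case "clip": { // value is "pos:part", e.g. "1:lem" (assumed encoding)
                String[] posAndPart = n.value.split(":");
                return words[Integer.parseInt(posAndPart[0]) - 1].clip(posAndPart[1]);
            }
            case "lit":
                return n.value;
            default:
                throw new IllegalArgumentException("Unknown tag: " + n.tag);
        }
    }

    /** Evaluates an "equal" test node with two value children. */
    public static boolean evalTest(Node n, Word[] words) {
        if (!"equal".equals(n.tag)) {
            throw new IllegalArgumentException("Unknown test: " + n.tag);
        }
        return evalValue(n.children.get(0), words)
                .equals(evalValue(n.children.get(1), words));
    }

    /** Builds the tree for: lemma of word 1 equals "la". */
    public static Node sampleTest() {
        return new Node("equal", null,
                new Node("clip", "1:lem"),
                new Node("lit", "la"));
    }

    /** Compiled form: the same test written as straight-line Java. */
    public static boolean compiledTest(Word word1) {
        return word1.clip("lem").equals("la");
    }

    public static void main(String[] args) {
        Word w = new Word(Collections.singletonMap("lem", "la"));
        System.out.println(evalTest(sampleTest(), new Word[] { w }));
        System.out.println(compiledTest(w));
    }
}
```

Both forms give the same answer, but the compiled one avoids the tag dispatch, string splitting and tree walking on every call, and leaves a plain method for the JIT compiler to optimize.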

Parsing /home/j/esperanto/apertium-svn/apertium/trunk/lttoolbox-java/testdata/transfer/apertium-eo-en.eo-en.t1x
// WARNING: Attribute a_np_acr is not defined. Valid attributes are: [a_nom, a_prp, a_adv, a_adj, a_vrb, a_vrb2, a_det, a_ord, a_prn, a_tns, a_nepersonaj_tempoj, a_gen, a_prs, a_nbr, a_cas, lem, lemq, lemh, whole, tags, chname, chcontent, content]
// Replacing with error_UNKNOWN_ATTR - for <transfer default="chunk">/<section-def-macros>/<def-macro n="firstWord" npar="1">/<choose>/<when>/<test>/<equal>/<clip part="a_np_acr" pos="1" side="sl">
Compiling: javac -cp dist/lttoolbox.jar transfertest/res/lttoolbox-java/testdata/transfer/apertium_eo_en_eo_en_t1x.java
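The log above shows the generated source being compiled with `javac`. As a rough sketch of how such a step could be done from within Java, the standard `javax.tools` API can compile a source file and a `URLClassLoader` can load the result. This is an assumption for illustration, not necessarily how lttoolbox-java does it; the class `CompileAndLoad` and its method are hypothetical names.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class CompileAndLoad {

    /**
     * Compiles one Java source file against the given classpath and
     * loads the named class from the freshly produced class files.
     */
    public static Class<?> compileAndLoad(Path sourceFile, String className,
                                          String classpath) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            // Happens when running on a runtime without the compiler module.
            throw new IllegalStateException("No system Java compiler available");
        }

        // Write the class files to a fresh temporary directory.
        Path outDir = Files.createTempDirectory("transfer-classes");
        int status = compiler.run(null, null, null,
                "-cp", classpath,
                "-d", outDir.toString(),
                sourceFile.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed: " + sourceFile);
        }

        // The loader is left open so the class can load its dependencies later.
        URLClassLoader loader =
                new URLClassLoader(new URL[] { outDir.toUri().toURL() });
        return loader.loadClass(className);
    }
}
```

For a class that depends on the library, the `classpath` argument would play the role of `dist/lttoolbox.jar` in the log line.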

Here is a speed comparison:

Interpreted transfer took 91.59 secs
bytecode compiled transfer took 15.88 secs
Speedup factor: 5.76
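
The speedup factor is simply the interpreted time divided by the compiled time. A small sketch of that arithmetic, using the two timings reported above (the class and method names are made up for this example):

```java
import java.util.Locale;

public class Speedup {

    /** Returns how many times faster the optimized run is than the baseline. */
    public static double factor(double baselineSecs, double optimizedSecs) {
        if (optimizedSecs <= 0) {
            throw new IllegalArgumentException("optimizedSecs must be positive");
        }
        return baselineSecs / optimizedSecs;
    }

    public static void main(String[] args) {
        double f = factor(91.59, 15.88); // timings from the comparison above
        System.out.printf(Locale.ROOT, "Speedup factor: %.4f%n", f);
    }
}
```

With the printed timings the quotient is about 5.768. The log shows 5.76, which may reflect truncation or a calculation on the unrounded timings.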

Further work

The Java code has not been optimized for speed, so the real potential speedup is perhaps a factor of 6-8, or even higher if a mixed mode is used (mixing C and Java code instead of pure Java).
Memory usage is also higher than necessary. Among other things:
The underlying library, lttoolbox-java, uses 50% of the CPU time, and it has some well-known performance issues that are fixable.
The bytecode should be run through an optimizer such as Soot.
There are many open-source Java bytecode interpreters to choose from, the most prominent being Sun's own and Kaffe (http://kaffe.org). Only Sun's has been tested. At least GCJ should be tried out.
A step for post-compiling to native code should be tried out.
With http://xmlvm.org/ there could be a way to run on iPhones as well.
Considering that we have a full port of lttoolbox, Apertium could be made to run purely on Java, enabling a wide range of platforms, among others Windows, phones (J2ME or Android), web pages and server systems. Only the tagger is missing for a full system.

@@ Line 40: / Line 40: @@
 * The underlying library, [[lttoolbox-java]], is using 50% of the CPU, and there are some well known performance issues which are fixable
 * The bytecode should be pulled thru an optimizer, like [http://www.sable.mcgill.ca/soot/tutorial/optimizer/index.html Soot]
+* There is a zillion of [http://www.google.com/search?hl=eo&q=Java+bytecode+interpreters+open+source Open Source Java bytecode interpreters] to choose from, most prominent Sun's own and http://kaffe.org. Only Sun's have been tested. At least GCJ should be tried out.
+* A step for post-compiling to native code should be tried out.
+* With http://xmlvm.org/ there could be a way for iPhones as well
 * Considering that we have a full port lttoolbox, Apertium could be made to run purely on Java, enabling a wide range of platforms, i.a. Windows, phones (J2ME or Android), web pages, server systems. Only the tagger is missing for a full system.
