Lttoolbox-java

From Apertium
Revision as of 10:54, 24 November 2009 by Jacob Nordfalk (talk | contribs)

What is lttoolbox

lttoolbox is a set of tools that 1) compile .dix files into binary files (lt-comp), 2) analyse or generate text (lt-proc), and 3) expand a .dix file (lt-expand).

Reasons for a Java port

  • There are several devices (mobile phones, for example) which can run quite complicated software, but only if it is written in Java. Porting lttoolbox is the first step towards running Apertium on these devices.
  • Windows port. It won't be as powerful as on a Unix-based system, but it will be there.
  • Apertium will be the first MT system *ever* that can be demonstrated within a Java applet.
  • Transfer in bytecode promises a speedup of a factor of 4 compared to what we use now (interpreted XML), and transfer dominates CPU usage when processing large amounts of text.

State of the Java port

j@j-laptop-nova:~/esperanto/apertium/lttoolbox-java/testdata/regression$ ./compare_java_and_c.sh
C analysis is... 0.41sec
OK
Java analysis is... 3.13sec
OK
C generator -g is ... 0.34sec
OK
Java generator -g is ... 2.31sec
OK
C generator -d is ... 0.32sec
OK
Java generator -d is ... 2.09sec
OK
C generator -n is ... 0.32sec
OK
Java generator -n is ... 2.56sec
OK
C postgenerator -p is ... 0.04sec
OK
Java postgenerator -p is ... 1.19sec
OK
All tests passed

--Jacob Nordfalk 10:52, 24 November 2009 (UTC)
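
To make the timings above easier to compare, here is a minimal sketch that computes how many times slower the Java run was than the C run for each test. The figures are copied from the listing; the class and method names (SlowdownTable, slowdown, timings) are made up for illustration and are not part of lttoolbox-java.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative helper: slowdown of the Java run relative to the C run. */
public class SlowdownTable {

    /** Ratio of Java wall-clock time to C wall-clock time. */
    public static double slowdown(double javaSeconds, double cSeconds) {
        if (cSeconds <= 0) {
            throw new IllegalArgumentException("C time must be positive");
        }
        return javaSeconds / cSeconds;
    }

    /** Timings copied from the compare_java_and_c.sh output: {C seconds, Java seconds}. */
    public static Map<String, double[]> timings() {
        Map<String, double[]> t = new LinkedHashMap<>();
        t.put("analysis", new double[] {0.41, 3.13});
        t.put("generator -g", new double[] {0.34, 2.31});
        t.put("generator -d", new double[] {0.32, 2.09});
        t.put("generator -n", new double[] {0.32, 2.56});
        t.put("postgenerator -p", new double[] {0.04, 1.19});
        return t;
    }

    public static void main(String[] args) {
        // Print one row per test, in the same order as the listing.
        for (Map.Entry<String, double[]> e : timings().entrySet()) {
            double c = e.getValue()[0];
            double java = e.getValue()[1];
            System.out.printf(Locale.ROOT, "%-17s C %.2fs  Java %.2fs  slowdown %.2fx%n",
                    e.getKey(), c, java, slowdown(java, c));
        }
    }
}
```

On these figures the Java port is roughly 6.5 to 8 times slower than C for analysis and generation, and about 30 times slower for the very short postgenerator run.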


Features

  • Binary compatibility with lttoolbox. lttoolbox-java is able to _read_ and _write_ lttoolbox's binary files, and it generates exactly the same output.
  • There is a comprehensive test suite that tests both lttoolbox (C++) and lttoolbox-java.


Other notes

<Drew_> jacobEo: I can't find a main class in the source code, am I looking in the wrong place? :S
<jacobEo> Drew_: LTComp.java, LTExpand.java, LTProc.java

Thanks

Nic Cottrell contributed an initial version of a Java port of lttoolbox. During GSoC 2009, Raphael and Sergio worked on it, but processing still didn't work (compilation and expansion did). In November 2009, Jacob Nordfalk finished it and optimized it.