From Apertium
Revision as of 10:11, 30 March 2009 by Jacob Nordfalk


<jimregan> Nic Cottrell contributed a Java port of lttoolbox
<jimregan> but it needs work to finish it
<jimregan> and a test suite, in both C++ and Java
<Drew_> ah, I've found the Lttoolbox page on the wiki
<jimregan> ok
<Drew_> this is a project I may be interested in - my specialty language is Java
<jacobEo> Great Drew_ !
<Drew_> :)
<jacobEo> it's in apertium-tools/lttoolbox-java
<Drew_> Do you have any more information on it at the minute?
<jacobEo> Drew_: What is in apertium-tools/lttoolbox-java right now
<jacobEo> is NOT working.
<jacobEo> in apertium-tools/lttoolbox-java is a line-for-line port of the C++ code of lttoolbox
<jacobEo> and the main problem is the XML handling

<jimregan> it has to be binary compatible
<Drew_> jacobEo, I will download Ubuntu now
<jacobEo> jimregan: Did you look at the Java code?
<jimregan> and the test suite has to be in both C++ and Java, to ensure that
<jacobEo> ok. "medium" then, or if there is anything between "easy" and "medium" choose that
<jimregan> yeah, it's almost line for line identical to the C++, aside from Java/C++ differences
<jimregan> but, the binary stuff can be hard
<jacobEo> therefore jimregan it's not that hard.
<jimregan> all you need is one bit in the wrong place, and it's useless
<jacobEo> jimregan: Binary stuff?
<jimregan> medium, then
<jimregan> jacobEo, yeah
<jimregan> well
<jimregan> the compression stuff
<cseong> if I don't know one of the required languages, for example if C, C++ and XML are the requirements and I don't know XML, can I still choose it?
<jimregan> and the transducer
<jacobEo> jimregan: The binary stuff is _probably_ easy, as you can debug the C++ and compare variables etc
<jimregan> cseong, XML is easy to pick up
<jimregan> there are plenty of APIs available
<jacobEo> cseong: Which project are you thinking of?
<jimregan> for C++, we use libxml2
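Note: the binary-compatibility requirement discussed above ("one bit in the wrong place, and it's useless") can be checked mechanically by comparing the file written by the C++ tools with the file written by the Java port, byte for byte. The following is only a minimal sketch of such a check. The class name BinaryCompare and the method firstMismatch are hypothetical and are not part of lttoolbox or lttoolbox-java, and the two file paths are supplied by the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of lttoolbox: compares two compiled
// dictionaries byte for byte and reports where they first diverge.
final class BinaryCompare {

    /**
     * Returns the offset of the first differing byte, or -1 if both arrays
     * are identical. If one array is a strict prefix of the other, the
     * length of the shorter array is returned.
     */
    static int firstMismatch(byte[] expected, byte[] actual) {
        int common = Math.min(expected.length, actual.length);
        for (int i = 0; i < common; i++) {
            if (expected[i] != actual[i]) {
                return i;
            }
        }
        return expected.length == actual.length ? -1 : common;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: BinaryCompare <cpp-output> <java-output>");
            System.exit(2);
        }
        // Both paths come from the caller; no file names are assumed here.
        byte[] fromCpp = Files.readAllBytes(Paths.get(args[0]));
        byte[] fromJava = Files.readAllBytes(Paths.get(args[1]));
        int offset = firstMismatch(fromCpp, fromJava);
        if (offset < 0) {
            System.out.println("files are byte-identical");
        } else {
            System.out.println("files differ at byte offset " + offset);
            System.exit(1);
        }
    }
}
```

Knowing the first differing offset gives a starting point for the debugging approach jacobEo suggests, namely stepping through the C++ code and comparing variables at the point where that byte is written.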

<jacobEo> Rah2: lttoolbox makes binary files out of the .dix files.
<jacobEo> Rah2: lttoolbox-java needs to at least be able to _read_ these binary files.
<Rah2> ok

  • vaasu (i=73548f22@gateway/web/ajax/ has joined #apertium

<jacobEo> Rah2: Did you try Apertium? Have a language pair installed?
<Rah2> I just svn checked out

  • vaasu has quit (Client Quit)
  • Drew_ ( has joined #apertium

<jimregan> wow!
<jimregan> Rah2, that was /fast/
<jimregan> I only finished adding that 5 minutes ago :)
<Rah2> no in fact it wasn't
<Rah2> It's like 900 MB
<Rah2> I took it all
<Rah2> I just started before you mentioned that project
<jimregan> no; I mean the Java lttoolbox idea :)
<jacobEo> Rah2: Pls compile lttoolbox and apertium and a language pair of your choice.
<Rah2> I was idling on that chan
<jacobEo> Rah2: Then much more will be clear

<jacobEo> Rah2: You don't need much knowledge of MT or NLP to do lttoolbox-java. But you need to know C++ and Java and be able to debug both
<Drew_> jacobEo: What was the location of lttoolbox again?
<jacobEo> Drew_: With SVN or as a ZIP file?
<CIA-18> apertium: nordfalk * r9192 /trunk/apertium-eo-en/apertium-eo-en.en-eo.t1x: More simplification. set_gender1 is almost unnecessary
<Drew_> um, I am using Tortoise SVN, is there a ZIP file uploaded somewhere?
<jacobEo> You can get SVN things as ZIP files.
<Drew_> ah right
<jacobEo>
<jacobEo>

<jacobEo> "Download GNU tarball" will give a compressed archive

<jacobEo> The problem, I think, is the XML handling: The C code's library callback calls a method in the code both when it meets a START and an END tag.
<jimregan> avinesh, maybe you should say it to spectie because I'm not interested in unicode -> wx
<jimregan> not for any reason
<jacobEo> the Java's XML library only calls the callback method at the START tag.
<jimregan> jacobEo, that will be necessary for chunk merging
<jimregan> we don't have it yet, but it will be necessary
<Leftmost> jimregan, I guess I'm a bit unclear as to what form the regression tests should take. Simply translations between ga and gd?
<jimregan> because when chcontent in t2 is written in chunk mode, it will be without { or }, otherwise with
<avinesh> ok got it
<jimregan> to fit the current model, that has to be a bool set and unset on entry/exit
<Drew_> jacobEo: Is it a big job to make it work with the END tag?
<avinesh> no wx right :D
<jimregan> that's it
<jimregan> avinesh, no one told me anudev is a course supervisor :/
<jacobEo> Drew_: I don't know. Perhaps we could find another Java XML library that can also be made to call back for the end tags. Or some kind of in-between wrapper could be made. Or you could use SAX and make your own callback thing.
<jimregan> I think I would have expected more of his opinions if I knew he wasn't actually doing any of the work
<avinesh> umm he's mainly working on anusaraka
<jacobEo> Drew_: There might be other problems. The project just got stranded on the XML parse part.
<Drew_> jacobEo: Ah, ok. I'm just compiling it now
<jacobEo> Drew_: You have to run the code to see. To do that you need to have at least one language pair running on your machine
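Note: the SAX option jacobEo mentions does deliver a callback for both START and END tags, through startElement and endElement on org.xml.sax.helpers.DefaultHandler in the standard Java library. The following is only a rough sketch of that idea and not the code in apertium-tools/lttoolbox-java. The class name StartEndTagDemo, the method events and the sample XML string are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records every START and END tag that the
// standard SAX parser reports for a small XML string.
final class StartEndTagDemo {

    /** Parses the given XML and returns the tag events in document order. */
    static List<String> events(String xml) {
        final List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                seen.add("START " + qName); // called when a tag opens
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                seen.add("END " + qName);   // called when the same tag closes
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrapped so callers do not have to handle checked exceptions.
            throw new IllegalStateException("could not parse XML", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Illustrative input only; real .dix files are far larger.
        String sample = "<dictionary><section><e/></section></dictionary>";
        for (String event : events(sample)) {
            System.out.println(event);
        }
    }
}
```

An empty element such as `<e/>` is reported as a START immediately followed by an END, so a handler that sets a flag on entry and clears it on exit sees both events for it as well.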

  • vaasu (n=yt@ has joined #apertium

<Drew_> jacobEo: I can't find a main class in the source code, am I looking in the wrong place? :S
<jacobEo> Drew_: The Java code?
<cseong> uhm.. I am interested in improving interoperability.. but what formats are you referring to?
<Drew_> jacobEo: Yeah, I loaded the Java code into Eclipse but it can't find a main method to compile the .java's
<jacobEo> Drew_:,,
<jimregan> avinesh, yeah. So I was right when I thought he expected us to change all of apertium to suit the analyser :/

  • abhiSri (i=AB-Alway@ has joined #apertium

<jacobEo> Drew_: Use Netbeans if you can. It's kinda standard here in Apertium