Apertium-dixtools

See also Crossdics


Download

$ svn co https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-dixtools

Software prerequisites

You will need to install Ant and the Java Development Kit 6 (JDK 6):

$ sudo apt-get install ant sun-java6-jdk

Compiling

$ cd apertium-dixtools
$ ant jar

Note:

  • If you update from SVN, it's always a good idea to run 'ant clean' first.
  • 'ant jar' also runs some self-tests. These might fail if someone made changes without ensuring that the tests still run. If so, just continue with the installation and report the test failures to the mailing list.

On a Mac, you can use the following command if you get an error about the J2SE platform not being set up in platform properties:

$ ant -Dplatforms.JDK_1.6.home=/usr jar

(If you want to put the full "/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/" (or whatever) path in there instead, first make a symlink from the .../1.6.0/Commands folder to .../1.6.0/bin, since ant expects javac to be in the bin subdirectory of platforms...home.)

Installing

$ sudo ant install


Notes for developers

Wishlist and notes for Apertium-dixtools

  • There's an awful lot of code, much more than needed. It would be better to find another way of handling XML, one where you don't have to write classes (and formatting code!!) for each tag.
    • If you already have an XML schema (.xsd) for your XML file structure, JAXB (Java Architecture for XML Binding) might be your choice. You just run the .xsd through the JAXB compiler (xjc) and get a bunch of classes (yes, one class per tag/type, but you don't have to write them yourself). Then you use the JAXB marshaller to convert XML documents to object structures and vice versa (with optional validation support). The JAXB marshalling code is included in the Sun JRE since version 6, and the JAXB compiler is available under a dual GPL+CDDL license. I have used JAXB a lot (both at work and for hobby projects) and I really like it. Of course, it is still your decision. --Mihi 19:18, 24 February 2009 (UTC)
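
As a rough illustration of the first wishlist item, the sketch below handles a dictionary fragment with the DOM parser that ships with the JDK, so no class per tag has to be written. It is not JAXB and not how dixtools currently works. The class GenericDixReader and its methods are made up for this example, and the sample fragment assumes only the usual <e>, <p>, <l>, <r> and <s> elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of apertium-dixtools.
class GenericDixReader {

    // Parse dictionary XML held in a string with the JDK's built-in DOM parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse XML", ex);
        }
    }

    // Count elements by tag name; the same code works for any tag.
    static int count(Document doc, String tag) {
        return doc.getElementsByTagName(tag).getLength();
    }

    // Write any element on one line by walking the tree generically.
    // Simplification: special characters are not re-escaped, and
    // whitespace-only text (indentation between tags) is dropped.
    static String oneLine(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue();
            return text.trim().isEmpty() ? "" : text;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return ""; // comments and other node types are ignored here
        }
        StringBuilder sb = new StringBuilder("<").append(node.getNodeName());
        NamedNodeMap attrs = node.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            sb.append(' ').append(a.getNodeName())
              .append("=\"").append(a.getNodeValue()).append('"');
        }
        NodeList children = node.getChildNodes();
        if (children.getLength() == 0) {
            return sb.append("/>").toString();
        }
        sb.append('>');
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(oneLine(children.item(i)));
        }
        return sb.append("</").append(node.getNodeName()).append('>').toString();
    }

    public static void main(String[] args) {
        String xml = "<dictionary>\n"
                + "  <section id=\"main\">\n"
                + "    <e>\n"
                + "      <p><l>house</l><r>casa<s n=\"n\"/></r></p>\n"
                + "    </e>\n"
                + "    <e><p><l>dog</l><r>perro</r></p></e>\n"
                + "  </section>\n"
                + "</dictionary>";
        Document doc = parse(xml);
        System.out.println("entries: " + count(doc, "e"));
        System.out.println(oneLine(doc.getElementsByTagName("e").item(0)));
    }
}
```

The trade-off against JAXB is that nothing is type-checked: tags are plain strings, so a misspelled tag name only shows up at run time.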

There should be many more formatting options, and ALL sub-commands should take a -fmt parameter through which all of them could be specified:

  • 1line or multiline entries
  • indenting
  • also 1line on pardefs
  • multiwords -- one line or many lines
  • multiwords -- should they be separated

(Sometimes, with complex multiwords, you want them laid out differently and kept apart. For example, a section for verbs might list the "simple" verbs first and then the multiword verbs:)

  • multiwords -- the simple verbs are one per line
  • multiwords -- and the multiword verbs are spread over several lines
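
To make the wish above more concrete, here is a small sketch of what a shared set of layout options could look like. Everything in it is hypothetical: the class FormatSketch, the Format fields, and the rule that an entry counts as a multiword when it contains a <b/> blank are assumptions for illustration, not existing dixtools options or behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names exist in apertium-dixtools.
class FormatSketch {

    // The layout choices from the list above, bundled into one object.
    static final class Format {
        final boolean oneLineSimple;      // simple entries on one line
        final boolean oneLineMultiword;   // multiword entries on one line
        final boolean separateMultiwords; // put multiwords after simple entries
        final String indent;

        Format(boolean oneLineSimple, boolean oneLineMultiword,
               boolean separateMultiwords, String indent) {
            this.oneLineSimple = oneLineSimple;
            this.oneLineMultiword = oneLineMultiword;
            this.separateMultiwords = separateMultiwords;
            this.indent = indent;
        }
    }

    // Assumption: an entry is a multiword if any part contains a <b/> blank.
    static boolean isMultiword(List<String> entry) {
        for (String part : entry) {
            if (part.contains("<b/>")) return true;
        }
        return false;
    }

    // An entry is given as [opening tag, inner parts..., closing tag].
    static String formatEntry(List<String> parts, boolean oneLine, String indent) {
        if (oneLine) return indent + String.join("", parts);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            boolean outer = (i == 0 || i == parts.size() - 1);
            if (i > 0) sb.append('\n');
            sb.append(indent).append(outer ? "" : indent).append(parts.get(i));
        }
        return sb.toString();
    }

    // Lay out a section: simple entries first, then (optionally) the multiwords.
    static String formatSection(List<List<String>> entries, Format fmt) {
        List<String> simple = new ArrayList<>();
        List<String> multi = new ArrayList<>();
        for (List<String> e : entries) {
            boolean mw = isMultiword(e);
            String text = formatEntry(e, mw ? fmt.oneLineMultiword : fmt.oneLineSimple, fmt.indent);
            (mw && fmt.separateMultiwords ? multi : simple).add(text);
        }
        String out = String.join("\n", simple);
        if (!multi.isEmpty()) {
            out += (simple.isEmpty() ? "" : "\n\n") + String.join("\n", multi);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> verbs = Arrays.asList(
                Arrays.asList("<e>", "<p><l>take<b/>out</l><r>sacar</r></p>", "</e>"),
                Arrays.asList("<e>", "<p><l>eat</l><r>comer</r></p>", "</e>"));
        // Simple verbs one per line; multiword verbs over several lines, kept apart.
        System.out.println(formatSection(verbs, new Format(true, false, true, "  ")));
    }
}
```

With one such object passed to every sub-command, each of them would lay out entries the same way without repeating the formatting logic.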