Language pair packages

From Apertium
Revision as of 18:59, 8 March 2015 by Ilnar.salimzyan (Talk | contribs)

Figure: The English ⇆ Spanish package running as a standalone Java application. The same file can be used from other client applications such as Apertium-Caffeine or Apertium-OmegaT.

Language pair packages are standalone JARs that can be run on their own or used from client applications such as Apertium-Caffeine or Apertium-OmegaT. The only prerequisite is Java 6 or later (apertium, lttoolbox and lttoolbox-java are NOT required), and they work on practically any platform (Linux, OS X, Windows and even Android).

Internal structure

Since JAR files are ZIP archives with a different extension, you can easily edit language pair packages to fit your needs. The packages are ready to use without any modification, so the vast majority of users will gain nothing from editing them. Editing can still be useful, for instance, to reduce file size by removing content you do not need.

The typical structure of a language pair package is as follows:

  • data/: Directory containing the language pair itself. You can extract it and use it with your local installation of Apertium.
  • transfer_classes/: Directory that contains the Java bytecode classes for transfer. These are only used when the package runs on standard Java, so if you are going to use it exclusively on Android, you can delete this directory.
  • org/: Directory that contains the lttoolbox-java engine, which makes the package self-executable. If you are not interested in this feature (presumably because you are going to use the package exclusively from client programs), you can delete it.
  • META-INF/: Directory that contains the MANIFEST.MF of this JAR, which is used by Java. It takes up only a few bytes and is rarely safe to remove, so please don't touch it unless you know what you are doing.
  • classes.dex: Dalvik bytecode of the transfer classes, used by Android instead of the standard Java bytecode classes in transfer_classes/. If you are not going to use the package on Android, you can delete it.
  • modes: Text file, used by lttoolbox-java, that lists the paths of the modes available inside the package. It takes up only a few bytes and is rarely safe to remove, so please don't touch it unless you know what you are doing.
  • README: Text file describing the content of the package.
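
Because a package is an ordinary ZIP archive, the trimming described above can be done with standard ZIP tools. The following is a minimal sketch, assuming the Info-ZIP zip and unzip commands are installed; list_package and strip_android are hypothetical helper names, not part of Apertium, and the JAR file names in the usage comment are only examples.

```shell
# List the entries stored in a language pair package.
# Usage: list_package PACKAGE_JAR
list_package() {
    unzip -l "$1"
}

# Copy a package and remove the Android-only Dalvik bytecode from the copy.
# The original file is left untouched.
# Usage: strip_android SOURCE_JAR TARGET_JAR
strip_android() {
    cp "$1" "$2" || return 1
    # zip -d deletes the named entries from an existing archive in place.
    zip -q -d "$2" "classes.dex"
}

# Example (file names are illustrative):
#   strip_android apertium-eo-en.jar apertium-eo-en-desktop.jar
#   list_package apertium-eo-en-desktop.jar
```

Working on a copy keeps the original package available in case the trimmed one turns out to be missing something a client application needs.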

Creating language pair packages

To create language pair packages, you need a working installation of the latest version of lttoolbox-java from SVN. Once you have it, simply run apertium-pack-j, passing as arguments the mode files for which you want to generate the package. For instance, the following command creates a ready-to-use package for the Esperanto ⇆ English language pair named apertium-eo-en.jar (the first argument determines the name of the output file):

apertium-pack-j /usr/local/share/apertium/modes/eo-en.mode /usr/local/share/apertium/modes/en-eo.mode
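
A mistyped mode path is an easy mistake to make here, so it can help to check the arguments before packaging. The sketch below wraps the command shown above; pack_pair is a hypothetical helper name, and the only thing it assumes about apertium-pack-j is that it accepts mode files as arguments, as described in this section.

```shell
# Verify that every mode file exists, then hand them all to apertium-pack-j.
# Usage: pack_pair MODE_FILE...
pack_pair() {
    if [ "$#" -eq 0 ]; then
        echo "usage: pack_pair MODE_FILE..." >&2
        return 2
    fi
    for mode_file in "$@"; do
        if [ ! -f "$mode_file" ]; then
            echo "mode file not found: $mode_file" >&2
            return 1
        fi
    done
    # Argument order is preserved, so the first mode still names the output.
    apertium-pack-j "$@"
}

# Example, using the paths from the command above:
#   pack_pair /usr/local/share/apertium/modes/eo-en.mode \
#             /usr/local/share/apertium/modes/en-eo.mode
```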

Creating Android compatible packages

To create Android-compatible packages, you need a working installation of the Android SDK. Once you have it, specify its location by setting the ANDROID_SDK_PATH environment variable as follows:

export ANDROID_SDK_PATH="/home/mikel/developer/android-sdk-linux"

Replace /home/mikel/developer/android-sdk-linux with the actual path to your Android SDK installation.

Once the variable is set, simply run apertium-pack-j as explained above, and the generated package will be compatible with Android devices.
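
Before packaging, it may be worth confirming that the variable is set and points at a real directory. The following is a small sketch of such a check; check_android_sdk is a hypothetical helper name, and it only inspects the ANDROID_SDK_PATH variable described above without validating the contents of the SDK itself.

```shell
# Succeed only if ANDROID_SDK_PATH is set and names an existing directory.
check_android_sdk() {
    if [ -z "${ANDROID_SDK_PATH:-}" ]; then
        echo "ANDROID_SDK_PATH is not set" >&2
        return 1
    fi
    if [ ! -d "$ANDROID_SDK_PATH" ]; then
        echo "ANDROID_SDK_PATH is not a directory: $ANDROID_SDK_PATH" >&2
        return 1
    fi
    return 0
}

# Example: only build the package when the SDK location looks valid.
#   check_android_sdk && apertium-pack-j /usr/local/share/apertium/modes/eo-en.mode
```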

List of ready-to-use packages

Out of the 31 released pairs, the following 25 have fully working, ready-to-use packages that are maintained under the builds/ directory in SVN. You can launch them directly in a Java-enabled browser by clicking on the JWS (Java Web Start) links. You can also click on the JAR links to download the JARs, and then run them as standard Java applications or use them from a client application.

  • Afrikaans ⇆ Dutch (JWS, JAR)
  • Basque → English (JWS, JAR)
  • Basque → Spanish (JWS, JAR)
  • Catalan ⇆ Italian (JWS, JAR)
  • English ⇆ Catalan (JWS, JAR)
  • English ⇆ Galician (JWS, JAR)
  • English ⇆ Spanish (JWS, JAR)
  • Esperanto ← Catalan (JWS, JAR)
  • Esperanto ⇆ English (JWS, JAR)
  • Esperanto ← French (JWS, JAR)
  • Esperanto ← Spanish (JWS, JAR)
  • French ⇆ Catalan (JWS, JAR)
  • French ⇆ Spanish (JWS, JAR)
  • Haitian → English (JWS, JAR)
  • Occitan ⇆ Catalan (JWS, JAR)
  • Occitan ⇆ Spanish (JWS, JAR)
  • Portuguese ⇆ Catalan (JWS, JAR)
  • Portuguese ⇆ Galician (JWS, JAR)
  • Spanish ⇆ Aragonese (JWS, JAR)
  • Spanish ⇆ Asturian (JWS, JAR)
  • Spanish ⇆ Catalan (JWS, JAR)
  • Spanish ⇆ Galician (JWS, JAR)
  • Spanish ⇆ Portuguese (JWS, JAR)
  • Spanish ← Romanian (JWS, JAR)
  • Swedish → Danish (JWS, JAR)

Language pairs with external dependencies

The following 7 released pairs depend on CG:

  • Breton → French (apertium-br-fr)
  • Icelandic → English (apertium-is-en)
  • Macedonian ⇆ Bulgarian (apertium-mk-bg)
  • Macedonian → English (apertium-mk-en)
  • Norwegian Nynorsk ⇆ Bokmål (apertium-nn-nb)
  • Welsh → English (apertium-cy-en)
  • Kazakh ⇆ Tatar (apertium-kaz-tat)

Language pair packages support invoking external programs, so it is still possible to create packages for these pairs. However, CG must be installed on your machine for them to work. Because of this limitation, precompiled packages are not offered for these pairs, but you can still create them yourself by following the instructions in the "Creating language pair packages" section above.
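
Since a package for one of these pairs depends on a program outside the JAR, a quick check that the program is reachable can save some debugging. The sketch below assumes CG is available on the PATH as a command named cg-proc, which is the name used by the VISL CG-3 tools; adjust it if your installation differs. require_tool is a hypothetical helper name, not part of Apertium.

```shell
# Succeed only if the named command can be found on the PATH.
# Usage: require_tool COMMAND_NAME
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "required tool not found on PATH: $1" >&2
    return 1
}

# Example (cg-proc is an assumed command name, see above):
#   require_tool cg-proc || echo "install CG before using this package"
```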
