Minimal installation from SVN

This guide shows you how to download, configure, compile and install core apertium packages and language data. It assumes you've already installed the prerequisites for your system – if you have not, see the system-specific guides under Installation. If you run into trouble, see Installation troubleshooting.

Note: some pairs require more than the four packages described here. See the bottom of this page if your language pair complains about lacking CG, HFST or language data like apertium-rus.

Installing apertium and a language pair

Download

For most language pairs, these are the packages you need:

  • lttoolbox
  • apertium
  • apertium-lex-tools
  • the language pair(s) you are interested in

Here are the commands if you would like the Esperanto-English pair:

svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/lttoolbox
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium-lex-tools
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium-eo-en

Note: please make sure that the path of the directory where you put these files (i.e. where you run the svn command) doesn't contain spaces or other special characters. These may cause errors while compiling or linking.
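As a quick sanity check before checking anything out, the current directory can be tested for problematic characters. This is only a sketch: path_is_safe is a hypothetical helper name, and the set of allowed characters (letters, digits, "/", ".", "_" and "-") is a deliberately conservative assumption, not a list taken from the build tools.

```shell
# Sketch: succeed only if the path consists of letters, digits, "/", ".",
# "_" and "-". The allowed set is an assumption chosen to be conservative.
path_is_safe() {
    case $1 in
        *[!A-Za-z0-9/._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

if path_is_safe "$PWD"; then
    echo "OK: $PWD"
else
    echo "Warning: $PWD contains spaces or special characters" >&2
fi
```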

If you want a pair other than eo-en, only the last line needs changing. To see the available 'released' language pairs, go to https://svn.code.sf.net/p/apertium/svn/trunk/ (pairs that are still in development are in the incubator, nursery and staging subdirectories of https://svn.code.sf.net/p/apertium/svn/).
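Since only the pair name changes between checkouts, the four commands can be generated from it. The following is a minimal sketch; checkout_commands and the TRUNK variable are hypothetical names introduced here for illustration, and the function only prints the commands rather than running them.

```shell
# Sketch: print the checkout command for each package needed for one pair.
# checkout_commands and TRUNK are hypothetical names, not part of Apertium.
TRUNK=https://svn.code.sf.net/p/apertium/svn/trunk

checkout_commands() {
    pair=$1    # e.g. apertium-eo-en
    for pkg in lttoolbox apertium apertium-lex-tools "$pair"; do
        echo "svn checkout $TRUNK/$pkg"
    done
}

# Print the commands for the Esperanto-English pair.
checkout_commands apertium-eo-en
```

Once the printed commands look right, they can be run by piping the output to a shell, e.g. checkout_commands apertium-eo-en | sh.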

If a language pair has more dependencies than the three shown above, the README should mention it (and the autogen.sh step should fail with a message about what is missing). The bottom of this page has pointers on how to install other possible dependencies.

Set up environment

By default, Apertium is installed under the directory /usr/local, which requires root (sudo) access when installing. If that's fine with you, begin by pasting these lines into your terminal:

LD_LIBRARY_PATH=/usr/local/lib:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:${PKG_CONFIG_PATH}
export PKG_CONFIG_PATH

You should also put those lines in your ~/.bashrc so you don't have to paste them into every terminal you open.

However, if you want it installed somewhere else or don't want to install it as root, instead paste these lines into your terminal:

PREFIX=$HOME/local # or wherever you want apertium stuff installed
LD_LIBRARY_PATH=$PREFIX/lib:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig:${PKG_CONFIG_PATH}
export PKG_CONFIG_PATH

You should also put those lines in your ~/.bashrc so you don't have to paste them into every terminal you open.
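To confirm that a terminal has picked up these settings, you can check whether the two variables contain the expected directories. This is a sketch only; list_contains is a hypothetical helper, and the check falls back to /usr/local when PREFIX is not set.

```shell
# Sketch: check that a colon-separated path list contains a directory.
# list_contains is a hypothetical helper name.
list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Use $PREFIX if it is set, otherwise the default install location.
prefix=${PREFIX:-/usr/local}

list_contains "${LD_LIBRARY_PATH:-}" "$prefix/lib" \
    || echo "LD_LIBRARY_PATH does not include $prefix/lib"
list_contains "${PKG_CONFIG_PATH:-}" "$prefix/lib/pkgconfig" \
    || echo "PKG_CONFIG_PATH does not include $prefix/lib/pkgconfig"
```

If either message is printed, the corresponding export line has not taken effect in the current terminal.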

Configure, build and install

The next step is to configure, build and install each of the modules you checked out, in this order:

  1. lttoolbox
  2. apertium
  3. apertium-lex-tools
  4. the language pair (e.g. apertium-eo-en)

cd to each of the directories before you run the commands shown below.

If you didn't specify $PREFIX above, or don't know what this means, then do this in each directory:

./autogen.sh
make

Then, for all programs apart from the language pair, do:

sudo make install
sudo ldconfig

If you specified a $PREFIX (e.g. to avoid installing as root), then do this in each directory:

./autogen.sh --prefix=$PREFIX
make

Then, for all programs apart from the language pair, do:

make install
ldconfig -n $PREFIX/lib


(If you're on a Mac, you don't need to run ldconfig; don't worry if it fails.)


If you had any trouble, see Installation troubleshooting.
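The build steps above differ only in whether $PREFIX is set and whether the module is the language pair. The sketch below prints the resulting commands for each module in order, so you can review or copy them; build_commands is a hypothetical helper, and apertium-eo-en stands in for whichever pair you checked out.

```shell
# Sketch: print the build commands for one module, following the steps above.
# build_commands is a hypothetical helper; pass "pair" as the second argument
# for the language pair, which is built but not installed.
build_commands() {
    dir=$1
    kind=${2:-core}
    echo "cd $dir"
    if [ -n "${PREFIX:-}" ]; then
        echo "./autogen.sh --prefix=$PREFIX"
    else
        echo "./autogen.sh"
    fi
    echo "make"
    if [ "$kind" != pair ]; then
        if [ -n "${PREFIX:-}" ]; then
            echo "make install"
            echo "ldconfig -n $PREFIX/lib"
        else
            echo "sudo make install"
            echo "sudo ldconfig"
        fi
    fi
    echo "cd .."
}

# The order matters: each module depends on the ones before it.
for module in lttoolbox apertium apertium-lex-tools; do
    build_commands "$module"
done
build_commands apertium-eo-en pair
```

On a Mac the ldconfig lines in the output can simply be skipped.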

Test

Now test that it works. The command apertium -l should show a list of translation directions, of the form "from-to". Pick one, and do

echo 'This is a test sentence.' | apertium from-to

replacing from-to with the direction you want.

You can see development translation modes if you do ls modes from the language pair directory. If you're in the language pair directory, and there is e.g. a file modes/eo-en-tagger.mode, you can run the translator up until the tagger by typing

echo 'This is a test sentence.' | apertium -d . eo-en-tagger

The -d . means "use the language data in this directory", and is useful if you don't want to type make install all the time.
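Mode names are the file names under modes/ without the .mode suffix. As a small sketch (list_modes is a hypothetical helper, not an Apertium command), the names usable with apertium -d . can be listed like this from the language pair directory:

```shell
# Sketch: list the mode names available in a language pair directory by
# stripping the .mode suffix from the files under modes/.
list_modes() {
    pairdir=${1:-.}
    for f in "$pairdir"/modes/*.mode; do
        [ -e "$f" ] || continue    # no .mode files: the glob stays unexpanded
        basename "$f" .mode
    done
}

# List the modes of the pair in the current directory.
list_modes .
```

Each printed name can then be used in place of eo-en-tagger in the command above.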

For language pairs that depend on monolingual packages (apertium-XYZ)

Many language pairs now have their monolingual data in separate packages (so that when several pairs have one language in common, we don't have to duplicate the data). If a pair depends on a monolingual package, the README should say so, and also the autogen.sh step should fail with a message like

No package 'apertium-XYZ' found

(where XYZ is some language code).

Monolingual packages are typically kept in https://svn.code.sf.net/p/apertium/svn/languages/ (more info at Languages) and compiled like the other packages. If a monolingual package installs a dictionary, the language pair uses that installed dictionary when compiling. However, to avoid having to type make install in the monolingual directory after every change there, you can tell the language pair the exact location of the monolingual package, and it will use the dictionary from that directory instead of the installed one. This is recommended for developers.

Imagine the language pair is called apertium-fie-bar, and it depends on the monolingual packages apertium-fie and apertium-bar. Assuming we have already installed lttoolbox, apertium and apertium-lex-tools as shown above, these would be the steps to download, configure and install apertium-fie-bar:

svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium-fie-bar
svn checkout https://svn.code.sf.net/p/apertium/svn/languages/apertium-fie
svn checkout https://svn.code.sf.net/p/apertium/svn/languages/apertium-bar

cd apertium-fie
./autogen.sh
make
cd ..

cd apertium-bar
./autogen.sh
make
cd ..

cd apertium-fie-bar
./autogen.sh --with-lang1=../apertium-fie --with-lang2=../apertium-bar
make

The --with-lang1 option gives the path to where you checked out apertium-fie, and --with-lang2 does the same for apertium-bar. If you run ./autogen.sh --help, it will tell you the possible --with-langN options and what they correspond to.

The process is similar for other language pairs that use monolingual packages.
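For other pairs, the options can often be derived from the pair name. The sketch below assumes that the monolingual checkouts sit next to the pair directory and that lang1 and lang2 follow the order of the codes in the pair name; both are assumptions, so confirm them with ./autogen.sh --help. The helper name with_lang_args is hypothetical.

```shell
# Sketch: derive --with-langN options from a pair name, assuming the
# monolingual checkouts sit next to the pair directory and that lang1/lang2
# follow the order of the codes in the pair name (check ./autogen.sh --help).
with_lang_args() {
    codes=${1#apertium-}      # apertium-fie-bar -> fie-bar
    lang1=${codes%%-*}        # fie
    lang2=${codes#*-}         # bar
    echo "--with-lang1=../apertium-$lang1 --with-lang2=../apertium-$lang2"
}

with_lang_args apertium-fie-bar
```

For apertium-fie-bar this prints the same two options as in the example above.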

For language pairs that use CG (vislcg3 / cg-proc / cg-comp)

Many language pairs now use Constraint Grammar (e.g. Macedonian→English, Breton→French, Nynorsk-Bokmål, …). For these, you need vislcg3 beforehand. See Vislcg3#Installing_VISL_CG3 for installation (use ./cmake.sh -DCMAKE_INSTALL_PREFIX=<prefix> if you're installing to a prefix).

Note that you have to have ICU installed beforehand (available through most GNU/Linux package managers, in Arch Linux as icu, in Debian/Ubuntu as libicu-dev, in Macports as icu).


For language pairs that use HFST (hfst-proc / hfst-lexc / hfst-twolc)

Many language pairs now use HFST (e.g. the Turkic and Saami ones). For these, you need hfst and foma beforehand. Follow the installation guides first for Foma, then HFST.


See also