Minimal installation from SVN
This guide shows you how to download, configure, compile and install core apertium packages and language data. It assumes you've already installed the prerequisites for your system – if you have not, see the system-specific guides under Installation. If you run into trouble, see Installation troubleshooting.
Note: some pairs require more than the four packages described here. See the bottom of this page if your language pair complains about lacking CG, HFST or language data like apertium-rus.
Before You Do Anything!
Do you really need the core tools from svn? Ask yourself, what do you want to work on?
- Translation, language pairs, source/target languages: Return to Installation and see if you can use the binary packages for the core tools, and then skip the lttoolbox, apertium, apertium-lex-tools, cg3, hfst parts of this page and instead follow the next section.
- Core C++ shared tools: Go right ahead...
- Don't know / Not sure: Ask on IRC what you should install.
Installing just the SVN language data
If you've already got the core tools installed (apertium, cg, hfst; or the apertium-all-dev package), then there's a script that can download and set up language data (the pair plus any monolingual dependencies) from SVN for you. Just go to the directory where you want your apertium data to be, and run
wget https://raw.githubusercontent.com/unhammer/apertium-get/master/apertium-get
chmod +x apertium-get
./apertium-get fie-bar
where "fie-bar" is the name of the language pair you want to work on, and you'll have the data correctly set up under your current directory.
Ask on IRC if there are problems.
Installing apertium and a language pair
Download
For most language pairs, these are the packages you need:
- lttoolbox
- apertium
- apertium-lex-tools
- the language pair(s) you are interested in
Here are the commands if you would like the Esperanto-English pair:
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/lttoolbox
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium-lex-tools
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium-eo-en
Note: please make sure that the directory where you put these files (i.e. where you run the svn command) doesn't contain spaces or other special characters, which may cause errors while compiling or linking.
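A quick check of the current directory before checking anything out might look like the sketch below. The function name path_is_safe is hypothetical, and the set of allowed characters (letters, digits, "/", ".", "_" and "-") is an assumption that is probably stricter than what the build tools actually require.

```shell
# Hypothetical helper: succeeds only if the path consists of
# letters, digits, "/", ".", "_" and "-" (an assumed safe set).
path_is_safe() {
    case "$1" in
        *[!A-Za-z0-9/._-]*) return 1 ;;   # found a character outside the set
        *) return 0 ;;
    esac
}

if path_is_safe "$PWD"; then
    echo "OK: $PWD"
else
    echo "Warning: $PWD contains spaces or special characters" >&2
fi
```

If the warning is printed, move to a differently named directory before running the svn commands.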
If you want a pair other than eo-en, only the last line needs changing. To see the available 'released' language pairs, go to https://svn.code.sf.net/p/apertium/svn/trunk/ (pairs which are in development are in the incubator/nursery/staging subdirectories of https://svn.code.sf.net/p/apertium/svn/).
If a language pair has more dependencies than the three shown above, the README should mention it (and the autogen.sh step should fail with a message about what is missing). The bottom of this page has pointers on how to install other possible dependencies.
Set up environment
By default, Apertium is installed under the directory /usr/local, which requires root (sudo) access when installing. If that's fine with you, begin by pasting these lines into your terminal:
LD_LIBRARY_PATH=/usr/local/lib:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:${PKG_CONFIG_PATH}
export PKG_CONFIG_PATH
You should also put those lines in your ~/.bashrc so you don't have to paste them into every terminal you open.
However, if you want it installed somewhere else or don't want to install it as root, instead paste these lines into your terminal:
PREFIX=$HOME/local # or wherever you want apertium stuff installed
LD_LIBRARY_PATH=$PREFIX/lib:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig:${PKG_CONFIG_PATH}
export PKG_CONFIG_PATH
You should also put those lines in your ~/.bashrc so you don't have to paste them into every terminal you open.
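One way to add the lines to a startup file without duplicating them on a second run is sketched below. The function name add_apertium_env and the marker comment are hypothetical; the variable assignments are the ones shown above, written for an explicit prefix (pass /usr/local for the default install location).

```shell
# Hypothetical helper: append the Apertium environment lines to a shell
# startup file, unless the marker comment shows they are already there.
# $1 = startup file (e.g. ~/.bashrc), $2 = install prefix
add_apertium_env() {
    rcfile=$1
    prefix=$2
    marker="# apertium environment"
    if grep -qF "$marker" "$rcfile" 2>/dev/null; then
        return 0    # already added, nothing to do
    fi
    cat >> "$rcfile" <<EOF
$marker
PREFIX="$prefix"
LD_LIBRARY_PATH=\$PREFIX/lib:\${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
PKG_CONFIG_PATH=\$PREFIX/lib/pkgconfig:\${PKG_CONFIG_PATH}
export PKG_CONFIG_PATH
EOF
}

# Usage: add_apertium_env ~/.bashrc "$HOME/local"
```

The prefix is expanded when the lines are written, so the startup file ends up with an absolute path.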
Configure, build and install
The next step is to configure, build and install each of the modules you checked out, in this order:
- lttoolbox
- apertium
- apertium-lex-tools
- the language pair (e.g. apertium-eo-en)
Change (cd) into each of the directories before you run the commands shown below.
If you didn't specify $PREFIX above, or don't know what this means, then do this in each directory:
./autogen.sh
make
Then, for all programs apart from the language pair, do:
sudo make install
sudo ldconfig
If you specified a $PREFIX (e.g. to avoid installing as root), then do this in each directory:
./autogen.sh --prefix=$PREFIX
make
Then, for all programs apart from the language pair, do:
make install
ldconfig -n $PREFIX/lib
(If you're on a Mac, you don't need to run ldconfig; don't worry if it fails.)
If you had any trouble, see Installation troubleshooting.
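The per-directory steps for the $PREFIX case can be scripted, as in the sketch below. It assumes the three core modules are checked out side by side in the current directory and that PREFIX is already set; the function names build_module and build_core are hypothetical. The language pair is left out on purpose, since it is only configured and compiled, not installed. For the default /usr/local case you would drop --prefix and run the install steps with sudo instead.

```shell
# Hypothetical helper: configure, build and install one core module under $PREFIX.
# The subshell keeps the cd from affecting the caller.
build_module() {
    (
        cd "$1" || exit 1
        ./autogen.sh --prefix="$PREFIX" && make && make install
    )
}

# Hypothetical helper: build the three core modules in the order given above,
# stopping at the first failure, then run ldconfig (not needed on a Mac,
# so its failure is ignored).
build_core() {
    for module in lttoolbox apertium apertium-lex-tools; do
        build_module "$module" || { echo "Build failed in $module" >&2; return 1; }
    done
    ldconfig -n "$PREFIX/lib" || true
}

# Usage, from the directory holding the checkouts, with PREFIX already set:
#   build_core
```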
Test
Now test that it works.
You can see the development translation modes by running ls modes from the language pair directory. If you're in the language pair directory and there is, for example, a file modes/eo-en-tagger.mode, you can run the translator up to the tagger by typing
echo 'This is a test sentence.' | apertium -d . eo-en-tagger
The full pipeline is typically named after the pair itself, e.g. eo-en:
echo 'This is a test sentence.' | apertium -d . eo-en
The -d . means "use the language data in this directory".
If you are a user who just wants to translate, and not hack on the language pair, you can run (sudo) make install from the language pair directory. This lets you run echo 'This is a test sentence' | apertium eo-en without the -d, from whatever directory you're in.
Developers should not do this, since most new/incubator language pairs don't work with installation :-)
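To get just the mode names that can be passed to apertium -d, rather than the file names that ls modes prints, something like the following sketch could be used. The function name list_modes is hypothetical; it only assumes that mode files live in modes/ and end in .mode, as described above.

```shell
# Hypothetical helper: print the mode names found in a pair's modes/ directory.
# $1 = language pair directory
list_modes() {
    for f in "$1"/modes/*.mode; do
        [ -e "$f" ] || continue      # no .mode files: the glob stays unexpanded
        basename "$f" .mode          # strip the directory and the .mode suffix
    done
}

# Usage, from the language pair directory:
#   list_modes .
```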
For language pairs that depend on monolingual packages (apertium-XYZ)
Many language pairs now have their monolingual data in separate packages (so that when several pairs have one language in common, we don't have to duplicate the data). If a pair depends on a monolingual package, the README should say so, and also the autogen.sh step should fail with a message like
No package 'apertium-XYZ' found
(where XYZ is some language code).
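That message comes from pkg-config, so you can ask pkg-config directly whether a monolingual package is visible before running autogen.sh. The sketch below uses the real pkg-config --exists option; the function name missing_packages is hypothetical, and apertium-fie and apertium-bar are the placeholder package names used on this page.

```shell
# Hypothetical helper: print each named package that pkg-config cannot find
# with the current PKG_CONFIG_PATH.
missing_packages() {
    for pkg in "$@"; do
        pkg-config --exists "$pkg" || echo "$pkg"
    done
}

# Usage:
#   missing_packages apertium-fie apertium-bar
```

An empty result means every listed package was found. A package that is only checked out, not installed, will still be reported as missing here.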
Monolingual packages are typically kept in https://svn.code.sf.net/p/apertium/svn/languages/ (more info at Languages) and compiled like the other packages.
If a monolingual package installs a dictionary, the language pair uses that installed dictionary when compiling. However, to avoid having to type make install in the monolingual directory after every change there, you can tell the language pair the exact location of the monolingual package, and it will use the dictionary from that directory instead of the installed one. This is recommended for developers.
Imagine the language pair is called apertium-fie-bar, and it depends on the monolingual packages apertium-fie and apertium-bar. Assuming we have already installed lttoolbox, apertium and apertium-lex-tools as shown above, these would be the steps to download, configure and install apertium-fie-bar:
svn checkout https://svn.code.sf.net/p/apertium/svn/trunk/apertium-fie-bar
svn checkout https://svn.code.sf.net/p/apertium/svn/languages/apertium-fie
svn checkout https://svn.code.sf.net/p/apertium/svn/languages/apertium-bar
cd apertium-fie
./autogen.sh
cd ..
cd apertium-bar
./autogen.sh
cd ..
cd apertium-fie-bar
./autogen.sh --with-lang1=../apertium-fie --with-lang2=../apertium-bar
# Now you can compile; using "make langs" in the pair will first compile the monolingual data, then the pair itself:
make -j3 langs
The --with-lang1 option gives the path to where you checked out apertium-fie. Running ./autogen.sh --help will tell you the possible --with-langN options and what they correspond to.
The process is similar for other language pairs that use monolingual packages.
For language pairs that use CG (vislcg3 / cg-proc / cg-comp)
Many language pairs now use Constraint Grammar (e.g. Macedonian→English, Breton→French, Nynorsk-Bokmål, …). For these, you need vislcg3 installed beforehand. See Vislcg3#Installing_VISL_CG3 for installation (use ./cmake.sh -DCMAKE_INSTALL_PREFIX=<prefix> if you're installing to a prefix).
Note that you have to have ICU installed beforehand (available through most GNU/Linux package managers, in Arch Linux as icu, in Debian/Ubuntu as libicu-dev, in Macports as icu).
For language pairs that use HFST (hfst-proc / hfst-lexc / hfst-twolc)
Many language pairs now use HFST (e.g. the Turkic and Saami ones). For these, you need hfst installed beforehand; follow the HFST installation guides first. HFST is a set of wrappers over several possible back-ends (Foma, OpenFST, SFST, …). The latest versions of HFST include the back-ends you need, so there's no reason to install any of them separately.
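Before configuring a pair that needs CG or HFST, you can check that the tools named in the two headings above are on your PATH. The sketch below relies only on the standard command -v shell builtin; the function name missing_tools is hypothetical.

```shell
# Hypothetical helper: print each named tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage:
#   missing_tools cg-proc cg-comp hfst-proc hfst-lexc hfst-twolc
```

If you installed to a prefix, the tools are only found when that prefix's bin directory is on your PATH.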
See also
- Installation – prerequisites and specific info for many different operating systems
- Installation Troubleshooting