Translating man pages
The Apertium package does not handle the man page format directly. You need the apertium-c-formatters package, which provides a deformatter that takes the specifics of this format into account.
Getting and installing the apertium-c-formatters package
If you have not already obtained the apertium-c-formatters package for translating mnemonic files, start with the following steps:
Download apertium-c-formatters from Git:
git clone --depth 1 https://github.com/apertium/apertium-c-formatters
Compile the source files:
make
Yes, compilation is very simple, and fast!
Install the tools:
make install
The above command assumes you have write access to /usr/local/bin and /usr/local/share/man. If not, enter the command:
sudo make install
By default, as for lttoolbox, apertium, and the language pairs, installation goes to /usr/local/bin and /usr/local/share. To change the installation directory, edit the first line of the makefile:
install_dir=/usr/local
Replace /usr/local with the parent directory under which the executable commands should be installed. For example, if you put:
install_dir=/usr
tools will be installed in /usr/bin and man pages in /usr/share/man.
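The edit described above can also be scripted. The following is a minimal sketch, assuming the first line of the makefile reads exactly install_dir=/usr/local as quoted above; it works on a scratch stand-in file (created here only for illustration) rather than on a real checkout:

```shell
# Work on a scratch copy so the example does not touch a real checkout.
workdir=$(mktemp -d)
printf 'install_dir=/usr/local\n' > "$workdir/makefile"   # stand-in for the real first line

# Rewrite line 1 only, and only if it is the install_dir assignment.
sed '1s|^install_dir=.*|install_dir=/usr|' "$workdir/makefile" > "$workdir/makefile.new"
mv "$workdir/makefile.new" "$workdir/makefile"

# Show the resulting first line.
head -n 1 "$workdir/makefile"
```

In a real checkout, the same sed expression would be applied to the makefile in the cloned directory before running make install.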
Usage
Available tools
The following tools are available for translating man pages:
- desman : deformatter for man pages,
- reman : generic reformatter used for man pages,
- apertium-man : a shell script intended to make the translation of man pages easier.
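Before going further, it can be worth checking that the three tools are reachable after installation. A small sketch follows; missing_tools is a hypothetical helper name introduced here for illustration and is not part of the package:

```shell
# Print the names of the given commands that cannot be found on PATH,
# separated by spaces (prints an empty line if all are found).
missing_tools() {
    missing=
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    printf '%s\n' "${missing# }"
}

not_found=$(missing_tools desman reman apertium-man)
if [ -n "$not_found" ]; then
    echo "not found on PATH: $not_found"
else
    echo "all tools found"
fi
```

If a tool is reported as not found, the installation directory chosen in the makefile is probably not listed in PATH.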
Translation by calling the various tools
Assume we are in the directory /usr/share/man and want to translate the English man page of the desman command into Spanish, writing the result directly to its final location. We can type:
cat man1/desman.1 | desman | apertium -f none en-es | reman > es/man1/desman.1
Note that the apertium command is run with the -f none option, which is needed so that the apertium-destxt deformatter is not used.
We can also give file names as parameters of the deformatter and the reformatter:
desman man1/desman.1 | apertium -f none en-es | reman - es/man1/desman.1
Note the dash - given as the first parameter of reman. Reformatters usually read from standard input (here, the output of the apertium command), but it can still be useful to save the reformatted result in a file. The problem is that the general syntax of Apertium reformatters requires the output file to be given as the second parameter. Passing - as the first parameter, in place of an input file name, gets around this problem.
Simpler with the apertium-man command
When a data format is built directly into the apertium command, the -f option lets you translate data in that format without calling a deformatter and a reformatter by hand. The apertium-man command is a shell script that provides the same convenience for man pages.
For example, to translate into Esperanto the English version of the man page of the command apertium-man we can type:
cat man1/apertium-man.1 | apertium-man en-eo > eo/man1/apertium-man.1
or even better:
apertium-man en-eo man1/apertium-man.1 eo/man1/apertium-man.1
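When a whole section has to be translated, the file-argument form above can be wrapped in a loop. This is only a sketch under stated assumptions: translate_section and the APERTIUM_MAN variable are hypothetical names introduced here (the variable merely lets another command stand in for apertium-man), and the pages are assumed to be uncompressed files ending in .1.

```shell
# Translate every page of one man section.
# Usage: translate_section PAIR SRC_DIR DST_DIR
#   e.g. translate_section en-eo man1 eo/man1
# APERTIUM_MAN is a hypothetical override hook; it defaults to apertium-man.
translate_section() {
    pair=$1
    src=$2
    dst=$3
    mkdir -p "$dst" || return 1
    for page in "$src"/*.1; do
        [ -f "$page" ] || continue      # the glob matched nothing
        # Same argument order as above: direction, input file, output file.
        "${APERTIUM_MAN:-apertium-man}" "$pair" "$page" "$dst/$(basename "$page")" || return 1
    done
}
```

The function stops at the first page that fails, so a partially translated section is easy to spot.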
The -u and -d datadir options of the apertium command are supported by apertium-man. You just have to place them before the parameter that gives the translation direction (as with the apertium command).
Characteristics of the deformatter for man pages
Some lines of a man page start with a mnemonic in capital letters preceded by a period. In particular, we find:
- .TH
- .SH
- .TP
- .B
- .I
- .PP
These mnemonics control the page layout of the document and must not be modified during translation. The desman deformatter marks as text to be kept unchanged every line beginning that consists of a period followed by one or more letters.
In addition, the commands described in a man page can accept options: generally a letter preceded by a -. To avoid unwanted * characters appearing in the output (or sometimes worse, with some letters and some source languages), the desman deformatter also marks as text to be kept unchanged any - immediately followed (without a space) by one or more alphanumeric characters.
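The two rules can be approximated with regular expressions to see which spans they cover. This is an illustration only: it is not how desman is implemented and does not reproduce its output format, and the sample text (including the -h and -i options) is invented for the example. grep -o is assumed to be available, as it is in GNU and BSD grep.

```shell
# Invented sample; not taken from a real man page.
sample='.TH DESMAN 1
.B desman
[ -h ] [ -i file ]
Reads a man page and protects layout requests.'

# Rule 1: a period at the start of a line followed by one or more letters.
requests=$(printf '%s\n' "$sample" | sed -n 's/^\(\.[A-Za-z][A-Za-z]*\).*/\1/p')

# Rule 2: a dash immediately followed by one or more alphanumeric characters.
options=$(printf '%s\n' "$sample" | grep -oE -e '-[[:alnum:]]+')

printf 'layout requests:\n%s\n' "$requests"
printf 'options:\n%s\n' "$options"
```

On this sample, the first rule picks out .TH and .B, and the second picks out -h and -i; everything else would be left for translation.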