Translating man pages

The man page format is not handled directly by the Apertium package. You need the apertium-c-formatters package, which provides a deformatter that takes the specifics of this format into account.

Getting and installing the apertium-c-formatters package

If you have not already obtained the apertium-c-formatters package for translating mnemonic files, start with the following steps:

Download apertium-c-formatters with Git:

git clone --depth 1 https://github.com/apertium/apertium-c-formatters

Compile the source files from inside the cloned apertium-c-formatters directory:

make

Yes, compilation is very simple, and fast!

Install the tools:

make install

The above command assumes you have write access to /usr/local/bin and /usr/local/share/man. If not, enter the command:

sudo make install

By default, as with lttoolbox, apertium, and the language pairs, installation goes to /usr/local/bin and /usr/local/share. To change the installation directory, edit the first line of the makefile:

install_dir=/usr/local

Replace the value with the parent directory under which the executables should be installed. For example, if you put:

install_dir=/usr

the tools will be installed in /usr/bin and the man pages in /usr/share/man.
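The edit to the makefile can also be scripted. The following is a minimal sketch, assuming the assignment line has exactly the form install_dir=... shown above. The function name set_install_dir is hypothetical and is not part of apertium-c-formatters.

```shell
# Hypothetical helper, not part of apertium-c-formatters.
# Usage: set_install_dir MAKEFILE NEW_DIR
# Assumes MAKEFILE contains a line of the form "install_dir=..." and that
# NEW_DIR contains no characters special to sed ("|", "&", backslash).
set_install_dir() {
    makefile=$1
    new_dir=$2
    # Rewrite the assignment in place, keeping the original as MAKEFILE.bak.
    sed -i.bak "s|^install_dir=.*|install_dir=$new_dir|" "$makefile"
}

# Example (run from the source directory, before "make install"):
#   set_install_dir makefile /usr
```

The backup copy makes it easy to return to the default /usr/local location later.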

Usage

Available tools

The following tools are available for translating man pages:

  • desman: deformatter for man pages,
  • reman: generic reformatter used for man pages,
  • apertium-man: a shell script intended to make translating man pages easier.

Translation by calling the various tools

Suppose we are in the directory /usr/share/man and want to translate the English man page of the desman command into Spanish, writing the result directly to its final location. We can type:

cat man1/desman.1 | desman | apertium -f none en-es | reman > es/man1/desman.1

Note that apertium is run with the -f none option, which is necessary so that the apertium-destxt deformatter is not used.

We can also give file names as parameters of the deformatter and the reformatter:

desman man1/desman.1 | apertium -f none en-es | reman - es/man1/desman.1

Note the dash - as the first parameter of reman. Reformatters usually read standard input (here, the output of the apertium command), but it can be useful to save the reformatted result in a file. The general syntax of Apertium reformatters requires the output file to be given as the second parameter, so the - serves as a placeholder for the first parameter and works around this.

Simpler with the apertium-man command

When a data format is built into the apertium command, the -f option translates data in that format without having to call a deformatter and a reformatter by hand. The apertium-man command is a shell script that offers the same convenience for man pages.

For example, to translate the English man page of the apertium-man command into Esperanto, we can type:

cat man1/apertium-man.1 | apertium-man en-eo > eo/man1/apertium-man.1

or even better:

apertium-man en-eo man1/apertium-man.1 eo/man1/apertium-man.1

The -u and -d datadir options of the apertium command are supported by apertium-man. Just place them before the parameter giving the translation direction, as with the apertium command itself.
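When a whole section has to be translated, the call can be wrapped in a loop. This is a minimal sketch, assuming the pages are stored uncompressed and that apertium-man accepts the argument order described above (options, then direction, then input and output files). The function name translate_section is hypothetical.

```shell
# Hypothetical helper built on apertium-man.
# Usage: translate_section PAIR SRC_DIR DST_DIR [apertium options...]
# Assumes every regular file in SRC_DIR is an uncompressed man page.
translate_section() {
    pair=$1
    src=$2
    dst=$3
    shift 3
    # Create the target directory (for example eo/man1) if it is missing.
    mkdir -p "$dst" || return 1
    for page in "$src"/*; do
        [ -f "$page" ] || continue
        # Options such as -u or -d datadir go before the direction.
        apertium-man "$@" "$pair" "$page" "$dst/$(basename "$page")" || return 1
    done
}

# Example, from /usr/share/man:
#   translate_section en-eo man1 eo/man1 -u
```

The loop stops at the first page that fails, so a problem with the language pair is noticed immediately rather than after the whole directory has been processed.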

Characteristics of the deformatter for man pages

Some lines of a man page start with a mnemonic in capital letters preceded by a period. In particular, we can find:

  • .TH
  • .SH
  • .TP
  • .B
  • .I
  • .PP

These mnemonics control the page layout of the document and must not be modified during translation. The desman deformatter marks as text to be kept unchanged every line beginning that consists of a period followed by one or more letters.

In addition, the commands described in a man page can accept options, generally a letter preceded by a -. To avoid unwanted * marks in the output (or sometimes worse, with some letters and some source languages), desman also marks as text to be kept unchanged any - immediately followed, without a space, by one or more alphanumeric characters.
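To make the two rules concrete, here is a rough approximation with sed. It is not desman's implementation: the function name mark_protected is hypothetical, the square brackets are only a visual marker chosen for this illustration, and the patterns follow a literal reading of the description above.

```shell
# Illustration only: approximates the two rules described above with sed.
# The [ ] markers are an assumption made here to show what would be
# protected; they are not claimed to be desman's actual output.
mark_protected() {
    sed -E \
        -e 's/^(\.[A-Za-z]+)/[\1]/' \
        -e 's/(-[[:alnum:]]+)/[\1]/g'
}

printf '%s\n' '.TH DESMAN 1' '.B desman' '-h shows help' | mark_protected
# prints:
#   [.TH] DESMAN 1
#   [.B] desman
#   [-h] shows help
```

Read literally, the second pattern also matches inside hyphenated words (apertium-man would become apertium[-man]). Whether desman behaves the same way is not stated here.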

See also

Documentation about man pages format