Difference between revisions of "Translating subtitles"

Revision as of 22:16, 7 February 2011

If you want to translate subtitles with Apertium, you have three options:

  • use Apertium Subtitles to get translation suggestions one by one, either from a local installation or from the server;
  • install the Gaupol extension;
  • translate from the command line, and then use your favorite subtitling application (e.g. Gaupol or Jubler) to post-edit.

Gaupol extension

The extension lives in the Apertium incubator. To install it, check it out into your Gaupol extensions directory. E.g. to install for just your user:

   $ mkdir -p ~/.local/share/gaupol/extensions # make sure the directory exists
   $ cd  ~/.local/share/gaupol/extensions
   $ svn co https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/gaupol apertium

Translating subtitles from the command line

There are no format filters for srt or sub files in Apertium, but with some trickery we can get them translated.

We'll use Translate Toolkit's sub2po to turn our subtitle file into a po-file, then use Pology's pomtrans to translate that with our local Apertium installation, then Translate Toolkit's po2sub can turn it back into a subtitle file.

Install the prerequisites

We need Translate Toolkit (which in turn needs Gaupol and chardet to perform the conversion) and Pology. And of course we need Apertium and a language pair; for that, see Minimal installation from SVN.

On Ubuntu

   $ apt-get install translate-toolkit gaupol python-chardet

And check out Pology from SVN:

   $ svn co svn://anonsvn.kde.org/home/kde/trunk/l10n-support/pology
   $ export PATH=$PWD/pology/bin:$PATH
   $ export PYTHONPATH=$PWD/pology:$PYTHONPATH
   $ . $PWD/pology/completion/bash/pology
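
Before going further it can help to confirm that all the tools are actually reachable from your shell. The function below is a minimal sketch, not part of Apertium or Pology; the name check_tools is made up for illustration. It prints every command it cannot find on $PATH and returns non-zero if any is missing.

```shell
# Hypothetical helper: report which of the given commands are not on $PATH.
# Prints one missing command name per line; returns 1 if any is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For the workflow on this page you would call it as check_tools sub2po po2sub pomtrans apertium; no output means everything was found.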

(see http://techbase.kde.org/Localization/Tools/Pology#About for more information on Pology)

On Arch Linux

   $ pacman -S translate-toolkit gaupol python2-chardet

Pology is in the Arch User Repository; either download and build with makepkg or use yaourt:

   $ yaourt -S pology-svn

Convert, translate, convert back, post-edit

Say you have the file Sintel.es.srt that you want to translate into Catalan.

Convert it into a po-file like this:

  sub2po -i Sintel.es.srt -o Sintel.es-ca.po

This po-file will have the Spanish as the source text, and completely empty target entries. Run that po-file through Apertium:

  /opt/pology/bin/pomtrans -s es -t ca -T /usr/local/bin/apertium -M es-ca apertium Sintel.es-ca.po

where -T gives the path to your apertium installation, and -M is the mode to use. Now the po-file will have Catalan text in the target entries :-) Convert it back to a subtitle file, using the timestamps from the original subtitle file:

  po2sub --fuzzy -t Sintel.es.srt -i Sintel.es-ca.po -o Sintel.ca.srt

We give --fuzzy to include fuzzy entries. Pology's pomtrans by default (as it should) marks all machine translated text as fuzzy.
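
To see how many entries pomtrans flagged, you can count the flag comments in the po-file. This is a rough sketch that relies only on the standard po convention that flags sit on comment lines starting with "#,"; the function name count_fuzzy is invented for this example.

```shell
# Hypothetical helper: count the entries flagged as fuzzy in a po-file.
# Flags are stored on comment lines starting with "#," (e.g. "#, fuzzy").
# Note: grep -c still prints 0 when nothing matches, but exits non-zero.
count_fuzzy() {
    grep -c '^#,.*fuzzy' "$1"
}
```

Running count_fuzzy Sintel.es-ca.po after the pomtrans step should then give a number close to the number of subtitles, since every machine-translated entry is marked fuzzy.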

Now you can post-edit Sintel.ca.srt in Jubler, Gaupol or whatever editor you prefer. Alternatively, you can edit the po-file itself in a po-editor like Virtaal or Lokalize before the po2sub step, although then you won't be able to compare with the movie as easily.

A script that does it for you

Put the script below in a file called e.g. apertium-sub, make it executable with chmod +x apertium-sub and put it in your $PATH. Then you can translate subtitles like this:

   $ apertium-sub es-ca Sintel.es.srt Sintel.ca.srt
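
If you have several subtitle files, the same command can be run in a loop. The following is only a sketch: target_name and translate_all are hypothetical helpers, and they assume files in the current directory named like name.LANG.srt (for instance Sintel.es.srt).

```shell
# Hypothetical helper: derive the output name for a target language,
# e.g. "Sintel.es.srt" + "ca" -> "Sintel.ca.srt".
# Assumes a plain file name (no directory part) ending in .srt.
target_name() {
    local base="${1%.srt}"      # strip the extension: Sintel.es
    echo "${base%.*}.$2.srt"    # strip the language code, add the new one
}

# Hypothetical helper: translate every given file with apertium-sub.
# Arguments: MODE TARGET_LANG FILE...
translate_all() {
    local mode="$1" lang="$2" f
    shift 2
    for f in "$@"; do
        apertium-sub "$mode" "$f" "$(target_name "$f" "$lang")"
    done
}
```

For example, translate_all es-ca ca *.es.srt would write one .ca.srt file next to each Spanish original.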


The script:

#!/bin/bash

# Put the correct paths to your programs here:
SUB2PO=/usr/bin/sub2po
PO2SUB=/usr/bin/po2sub
POMTRANS=/opt/pology/bin/pomtrans
APERTIUM=/usr/local/bin/apertium

# You shouldn't have to change anything below this line

if [ $# -ne 3 ]; then
    echo "Usage: bash $0 mode input.srt output.srt";
    echo "Example: bash $0 es-ca_valencia mymovie.es.srt mymovie.ca.srt";
    exit 1;
fi

mode="$1";
insub="$2";
outsub="$3";

pofile=$(mktemp -t "$mode.XXXXXXXXXX.po");

echo "Converting to po..."
$SUB2PO -i "$insub" -o "$pofile"

echo "Translating..."
"$POMTRANS" -s src -t trg -M "$mode" -T "$APERTIUM" apertium "$pofile"
# The -s and -t don't actually matter since we override them with -M,
# they just have to be non-empty.

echo "Converting back from po..."
"$PO2SUB" --fuzzy -t "$insub" -i "$pofile" -o "$outsub"
# Since pomtrans marked everything as fuzzy, tell po2sub to include fuzzy
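
The script above leaves its temporary po-file behind. One possible way to clean up, shown here as a sketch rather than as part of the original script, is to create the file inside a subshell and remove it from an EXIT trap. The helper name with_temp_po is made up for this example.

```shell
# Hypothetical helper: run a command with the path of a fresh temporary
# file appended as its last argument, and delete that file afterwards.
# The body runs in a subshell, so the EXIT trap fires when it finishes.
with_temp_po() (
    pofile=$(mktemp) || exit 1
    trap 'rm -f "$pofile"' EXIT
    "$@" "$pofile"
)
```

The three conversion steps could be wrapped in a function that takes the po-file path as its last argument and is then called through with_temp_po, so the file disappears even if one of the steps fails.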

See also