Tips for translators
This page collects practical tips and tricks for using Apertium as a translator.
How do I make the translator ignore certain strings?
Use one of the XML-based formats, e.g. html, and put <apertium-notrans>
tags around the text you don't want translated. For example:
$ echo "Translate me <apertium-notrans>don't translate me</apertium-notrans> but translate me" | apertium en-es -f html
Me traduzco <apertium-notrans>don't translate me</apertium-notrans> pero traducirme
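If the strings to protect are known in advance, the tags can be added automatically before the text reaches the translator. The following is only a minimal sketch and not part of Apertium: the function name protect_term is hypothetical, and it assumes the term contains no characters that are special to sed (such as /, &, . or *).

```shell
#!/bin/sh
# Hypothetical helper (not part of Apertium): wrap every occurrence of a
# fixed term in <apertium-notrans> tags so the html format leaves it alone.
# Assumption: the term contains no characters special to sed (/, &, ., * ...).
protect_term() {
    term=$1
    sed "s/$term/<apertium-notrans>$term<\/apertium-notrans>/g"
}

# Example: protect a product name before translating.
echo "Install Foo Bar on your computer" | protect_term "Foo Bar"
```

The output of protect_term can then be piped to apertium with -f html, as in the command above.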
The HTML format adds entities, I want plain (Unicode) symbols
When using the HTML format, most non-ASCII characters are turned into HTML entities:
$ echo "Today's <a id="foo" href="http://time.org"/>date</a> is March 12th" | apertium -f html en-ca
Avui <a id="foo" href=http://time.org/>la data</a> &eacute;s March 12&egrave;
This is not always what you want.
To keep plain Unicode characters, use the html-noent format instead (-f html-noent).
With older versions of apertium you have to use this hack: if you have perl and perl-html-parser installed, you can append the following little script to the command:
perl -we 'use HTML::Entities;binmode(STDOUT,":utf8");while(<STDIN>){print decode_entities($_);}'
e.g.
$ echo "Today's <a id="foo" href="http://time.org"/>date</a> is March 12th" | apertium -f html en-ca | perl -we 'use HTML::Entities; binmode(STDOUT, ":utf8");while(<STDIN>) { print decode_entities($_); }'
Avui <a id="foo" href=http://time.org/>la data</a> és March 12è
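If perl-html-parser is not available, the same decoding step can be done with another tool. The following is a sketch of one alternative, not something shipped with Apertium: it assumes python3 is installed and uses the standard library function html.unescape. The function name decode_entities is chosen here for illustration. Like the perl version, it decodes every entity, including ones such as &lt; and &amp;.

```shell
#!/bin/sh
# Hypothetical replacement for the perl one-liner (assumes python3 is installed).
# Reads UTF-8 text on stdin and writes it back with HTML entities decoded.
decode_entities() {
    python3 -c '
import html, sys
text = sys.stdin.buffer.read().decode("utf-8")
sys.stdout.buffer.write(html.unescape(text).encode("utf-8"))
'
}

# Example on a line like the translator output shown earlier.
printf '%s\n' 'Avui la data &eacute;s March 12&egrave;' | decode_entities
```

It can be appended to the apertium command with a pipe in the same way as the perl script.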
See also
- Translation memory for translating TMX / .tmx files
- Translating QT Linguist TS-files for how to translate .ts files
- Translating gettext for how to translate .po files
- Translating subtitles
- Translating wikimedia
- Format handling for a list of built-in supported input/output formats