Tips for translators
		
		
		
		
		
		
Revision as of 07:53, 5 April 2014
This page collects practical tips and tricks for using apertium as a translator.
How do I make the translator ignore certain strings?
Use one of the XML-based formats, e.g. html, and put <apertium-notrans> tags around the text you don't want translated. For example:
$ echo "Translate me <apertium-notrans>don't translate me</apertium-notrans> but translate me" | apertium en-es -f html
Me traduzco <apertium-notrans>don't translate me</apertium-notrans> pero traducirme
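If the same term has to be protected in many places, the tagging can be scripted before the text reaches the translator. The following is only a sketch: protect_term is a hypothetical helper name (not part of apertium), and it assumes the term contains no characters that are special to sed (such as |, &, \, ., * or [).

```shell
# Hypothetical helper: wrap every occurrence of a fixed term in
# <apertium-notrans> tags so that the html format leaves it untranslated.
# Assumption: the term has no sed-special characters (|, &, \, ., *, [).
protect_term() {
  term=$1
  # "&" in the replacement stands for the matched text.
  sed "s|$term|<apertium-notrans>&</apertium-notrans>|g"
}

# Tag the term; the result can then be piped on to the translator, e.g.
#   ... | protect_term "Apertium" | apertium en-es -f html
echo "Install Apertium from the package manager" | protect_term "Apertium"
```

The helper only adds the tags; the translation step itself is the same apertium en-es -f html call shown above.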
The HTML format adds entities, I want plain (Unicode) symbols
When using the HTML format, most non-ASCII characters are turned into HTML entities:
$ echo "Today's <a id="foo" href="http://time.org"/>date</a> is March 12th" | apertium -f html en-ca
Avui <a id="foo" href=http://time.org/>la data</a> &eacute;s March 12&egrave;
If you would rather keep plain Unicode characters, use the html-noent format instead (-f html-noent).
With older versions of apertium you have to use this hack: if you have perl and perl-html-parser installed, you can pipe the output through the following little script:
perl -we 'use HTML::Entities;binmode(STDOUT,":utf8");while(<STDIN>){print decode_entities($_);}'
e.g.
$ echo "Today's <a id="foo" href="http://time.org"/>date</a> is March 12th" |apertium -f html en-ca|perl -we 'use HTML::Entities; binmode(STDOUT, ":utf8");while(<STDIN>) { print decode_entities($_); }'
Avui  <a id="foo" href=http://time.org/>la data</a> és March 12è
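If perl-html-parser is not available, the same decoding step can probably be done with Python 3, whose standard library provides html.unescape. This is a sketch rather than an official apertium recipe: decode_entities is a hypothetical helper name, and it assumes python3 is on the PATH and that the text is UTF-8.

```shell
# Hypothetical helper: decode HTML entities on stdin using only the
# Python 3 standard library (html.unescape). Not part of apertium.
decode_entities() {
  # Force UTF-8 I/O so the decoded characters print correctly in any locale.
  PYTHONIOENCODING=utf-8 python3 -c 'import html, sys
for line in sys.stdin:
    sys.stdout.write(html.unescape(line))'
}

# Stand-alone check with a line like the one apertium produces above;
# in practice, append "| decode_entities" to the apertium command.
printf '%s\n' 'Avui la data &eacute;s March 12&egrave;' | decode_entities
```

Like the Perl version, this only post-processes the translator's output, so it decodes every entity in the stream, including ones that were already present in the input.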
See also
- Translation memory for translating TMX / .tmx files
- Translating QT Linguist TS-files for how to translate .ts files
- Translating gettext for how to translate .po files
- Translating subtitles
- Translating wikimedia
- Format handling for a list of built-in supported input/output formats

