Latest revision as of 17:08, 26 September 2016
How Apertium Works
This page uses apertium-en-ca as a running example to show how text is translated.
Apertium first takes the source text and deformats it, separating the formatting from the text to be translated so that translation does not affect the formatting.
This is done by enclosing formatting in superblanks (square brackets) and escaping special characters with backslashes.
This book is a <b>great</b> read.
is deformatted into:
$ echo -n "This book is a <b>great</b> read." | apertium-deshtml
This book is a[ <b>]great[<\/b> ]read..[][
Note: The deformatter cannot tell when input from stdin has ended, so you may see an unclosed superblank at the end of the output. You can safely remove it.
All formatting and whitespace within superblanks are ignored.
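As a rough illustration of the superblank convention, the following Python sketch splits deformatted text into translatable text and superblanks. It is not part of Apertium; the function name `split_superblanks` is hypothetical, and the sketch assumes that any character following a backslash is escaped and that superblanks do not nest.

```python
def split_superblanks(stream):
    """Split deformatted text into ('text', s) and ('blank', s) pieces.

    Hypothetical helper for illustration only, not Apertium code.
    """
    pieces = []
    buf = []
    in_blank = False
    i = 0
    while i < len(stream):
        ch = stream[i]
        if ch == "\\" and i + 1 < len(stream):
            # An escaped character is kept verbatim and never
            # opens or closes a superblank.
            buf.append(stream[i:i + 2])
            i += 2
            continue
        if ch == "[" and not in_blank:
            if buf:
                pieces.append(("text", "".join(buf)))
            buf = []
            in_blank = True
        elif ch == "]" and in_blank:
            pieces.append(("blank", "".join(buf)))
            buf = []
            in_blank = False
        else:
            buf.append(ch)
        i += 1
    if buf:
        # Leftover content; an unclosed superblank is still a blank.
        pieces.append(("blank" if in_blank else "text", "".join(buf)))
    return pieces


sample = r"This book is a[ <b>]great[<\/b> ]read..[]"
pieces = split_superblanks(sample)
# Only the 'text' pieces are candidates for translation.
texts = [s for kind, s in pieces if kind == "text"]
print(texts)  # ['This book is a', 'great', 'read..']
```

The formatting stays inside the 'blank' pieces, so it can be carried through unchanged while only the 'text' pieces are processed.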
Next, the text goes through morphological analysis, which lists every possible analysis (lemma plus tags) for each word.
$ echo "This book is a[ <b>]great[<\/b> ]read..[]" | lt-proc en-ca.automorf.bin
^This/This<det><dem><sg>/This<prn><tn><mf><sg>$ ^book/book<n><sg>/book<vblex><inf>/book<vblex><pres>$ ^is/be<vbser><pri><p3><sg>$[ <b>]^a/a<det><ind><sg>$ ^great/great<adj><sint>$[<\/b> ]^read/read<vblex><inf>/read<vblex><pres>/read<vblex><past>/read<vblex><pp>$^./.<sent>$^./.<sent>$[]
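To make the analyser output easier to read, here is a minimal Python sketch that pulls each unit of the form ^surface/analysis/analysis$ out of the stream. The name `parse_units` is hypothetical, and the sketch assumes that the units themselves contain no escaped `^`, `$`, or `/` characters, which holds for the sample above.

```python
import re

# One unit runs from '^' to the next '$'.
# Assumption: no escaped '^', '$' or '/' inside a unit.
UNIT = re.compile(r"\^([^$]*)\$")


def parse_units(stream):
    """Return a list of (surface_form, [analyses]) tuples."""
    units = []
    for match in UNIT.finditer(stream):
        surface, *analyses = match.group(1).split("/")
        units.append((surface, analyses))
    return units


sample = (
    "^book/book<n><sg>/book<vblex><inf>/book<vblex><pres>$ "
    "^is/be<vbser><pri><p3><sg>$[ <b>]^a/a<det><ind><sg>$"
)
units = parse_units(sample)
for surface, analyses in units:
    print(surface, len(analyses), analyses)

# Words with more than one analysis are ambiguous.
ambiguous = [surface for surface, analyses in units if len(analyses) > 1]
print(ambiguous)  # ['book']
```

In the sample, "book" has three analyses (one noun, two verb forms), while "is" and "a" have one each.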
The text is then tagged with PoS (Part of Speech) tags. The analysis above leaves many words ambiguous ("book", for example, can be a noun or a verb), so the tagger picks one analysis per word. Without this step, a word could be translated using the wrong PoS, which would cause major errors: a verb translated as a noun, for example, would garble the whole sentence.
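The idea of choosing one analysis per word can be sketched with a toy rule in Python. This is not how Apertium's tagger works; the single rule used here (prefer a noun reading directly after a determiner, otherwise take the first analysis) and the names `first_tag` and `disambiguate` are hypothetical and exist only to illustrate the step.

```python
def first_tag(analysis):
    """Return the first tag of an analysis, e.g. 'book<n><sg>' -> 'n'."""
    start = analysis.index("<") + 1
    return analysis[start:analysis.index(">", start)]


def disambiguate(units):
    """Pick one analysis per word using a toy heuristic.

    Hypothetical rule: after a determiner, prefer a noun reading;
    otherwise keep the first analysis listed.
    """
    chosen = []
    prev_tag = None
    for surface, analyses in units:
        pick = analyses[0]
        if prev_tag == "det":
            for analysis in analyses:
                if first_tag(analysis) == "n":
                    pick = analysis
                    break
        chosen.append((surface, pick))
        prev_tag = first_tag(pick)
    return chosen


units = [
    ("This", ["This<det><dem><sg>", "This<prn><tn><mf><sg>"]),
    ("book", ["book<n><sg>", "book<vblex><inf>", "book<vblex><pres>"]),
]
print(disambiguate(units))
# [('This', 'This<det><dem><sg>'), ('book', 'book<n><sg>')]
```

With "book" resolved to a noun, later stages would no longer consider its verb readings.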
More to be written...