
Apertium separable/report2017


Project description

The purpose of this project is to allow Apertium language-pair developers to better translate "separable" or "discontiguous" multiwords. We do this by re-ordering word tokens before translation occurs. For example, "take something out" becomes "take out something" so that "take out" can be translated as a single unit.

To do this, we used a finite-state transducer. The transducer accepted certain patterns of words (paradigms), such as adj-noun or det-adj-noun, that could come between the parts of a multiword. When a pattern was accepted, the transducer output the re-ordered words, so that the multiword could be translated as a unit.
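
The re-ordering step described above can be illustrated with a minimal Python sketch. This is not the lsx module's actual code or data format: the token representation, the `GAP_PATTERNS` and `MULTIWORDS` tables, and the `reorder` function are all hypothetical names chosen for illustration, and a plain loop stands in for the finite-state transducer.

```python
# Hypothetical illustration only: not the lsx module's real code or data format.
# Each token is a (surface form, part-of-speech tag) pair.

# Tag sequences allowed to sit between the two parts of a multiword
# (the "patterns" from the text; this particular set is an assumption).
GAP_PATTERNS = {
    ("prn",),
    ("noun",),
    ("adj", "noun"),
    ("det", "noun"),
    ("det", "adj", "noun"),
}

# Separable multiwords as (first part, second part); illustrative entries.
MULTIWORDS = {("take", "out"), ("turn", "off")}

MAX_GAP = 3  # longest gap pattern listed above


def _match_at(tokens, i):
    """Return the gap length if a multiword with an accepted gap starts at i."""
    first = tokens[i][0]
    for gap_len in range(1, MAX_GAP + 1):
        j = i + 1 + gap_len  # index of the candidate second part
        if j >= len(tokens):
            break
        gap_tags = tuple(tag for _, tag in tokens[i + 1 : j])
        if (first, tokens[j][0]) in MULTIWORDS and gap_tags in GAP_PATTERNS:
            return gap_len
    return None


def reorder(tokens):
    """Move the second part of a separable multiword next to its first part."""
    out = []
    i = 0
    while i < len(tokens):
        gap_len = _match_at(tokens, i)
        if gap_len is None:
            # No accepted pattern here: copy the token through unchanged.
            out.append(tokens[i])
            i += 1
        else:
            gap = tokens[i + 1 : i + 1 + gap_len]
            second = tokens[i + 1 + gap_len]
            out.extend([tokens[i], second, *gap])
            i += gap_len + 2
    return out


sentence = [("take", "vblex"), ("something", "prn"), ("out", "adv")]
print(" ".join(form for form, _ in reorder(sentence)))
```

Input that matches no pattern passes through unchanged, which mirrors the accept-or-leave-alone behaviour described for the transducer.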

Work done

  • all spacing, punctuation, and superblanks were preserved
  • (for language developers: have the language-data writer write it explicitly in the .lsx file)

Future work

See Lsx_module#Future_work.
