Talk:Ideas for Google Summer of Code/Apertium separable

From Apertium
Revision as of 12:44, 21 December 2021 by Unhammer

Modifying all MWEs to use separable only works for language pairs that don't use lexical selection for words that can turn into multiwords, which seems unlikely. Apertium-separable is great for translating into more idiomatic language, and to a certain extent you can do "lexical selection" with it, but it's not a replacement for LRX. LRX rules can 1. overlap and 2. combine with weights (and there is a --trace mode). LSX rules, on the other hand, are applied LRLM (left-to-right longest match, so no overlapping rules), and doing typical "lexical selection" tasks in LSX quickly becomes unwieldy: as a rewriter rather than a filter it gives you more power, but debugging is harder. Perhaps a second, target-language separable that runs after LRX could give us the best of both worlds, rewriting only the part after the first '/', but afaik separable can't do that yet (and one would have to invent some smart heuristics for what to do with the part before the '/'). --unhammer (talk) 12:44, 21 December 2021 (UTC)
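To make the "target-language separable" idea above a bit more concrete, here is a rough Python sketch of what such a pass might do. It is not how apertium-separable works today. It assumes that each lexical unit arrives as `^source/target$` with exactly one target reading left after LRX, and it ignores escapes and superblanks. The rule table, the tag sets and the function name `rewrite_targets` are made up for illustration. Matching is LRLM over the target sides only, and the source sides of a match are simply joined with a space, which is a placeholder for the "smart heuristics" that would still have to be invented.

```python
import re

# One lexical unit in stream format: ^source/target$
# Assumption: lexical selection has already left a single target reading.
LU = re.compile(r"\^([^/$]*)/([^$]*)\$")

# Hypothetical target-side rules: sequence of target readings -> replacement.
RULES = {
    ("dar<vblex><inf>", "un<det><ind><m><sg>", "paseo<n><m><sg>"):
        "pasear<vblex><inf>",
}


def rewrite_targets(stream, rules):
    """Rewrite only the part after '/' using left-to-right longest match."""
    units = LU.findall(stream)  # list of (source, target) pairs
    longest = max((len(p) for p in rules), default=0)
    out, i = [], 0
    while i < len(units):
        # Try the longest candidate first; a match consumes its units,
        # so matches can never overlap.
        for n in range(min(longest, len(units) - i), 0, -1):
            pattern = tuple(tgt for _, tgt in units[i:i + n])
            if pattern in rules:
                # Placeholder heuristic: keep all source sides, space-joined.
                src = " ".join(s for s, _ in units[i:i + n])
                out.append((src, rules[pattern]))
                i += n
                break
        else:
            out.append(units[i])  # no rule starts here, copy unchanged
            i += 1
    # Blanks between units are normalised to a single space in this sketch.
    return " ".join(f"^{s}/{t}$" for s, t in out)


stream = ("^take<vblex><inf>/dar<vblex><inf>$ "
          "^a<det><ind><sg>/un<det><ind><m><sg>$ "
          "^walk<n><sg>/paseo<n><m><sg>$ "
          "^today<adv>/hoy<adv>$")
print(rewrite_targets(stream, RULES))
# ^take<vblex><inf> a<det><ind><sg> walk<n><sg>/pasear<vblex><inf>$ ^today<adv>/hoy<adv>$
```

Since the pass only looks at what LRX already selected, LRX can keep doing the overlapping, weighted filtering, and the rewriter only has to deal with the idiomatic rephrasing.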