Transfuse

From Apertium
Revision as of 11:04, 16 June 2022 by Unhammer (→ Debugging transfuse issues)

Transfuse is used under the hood for format handling at both ends of the pipeline.

See

Notes on wordblank format

Wordblank spans must not be nested: the pipeline may not output nested structures, and Transfuse will not feed it any.

So [[t:text:SyTAKg]]nett[[/]][[t:text:SyTAKg]]lesar[[/]] is OK, while [[t:text:SyTAKg]]ne[[t:text:SyTAKg]]tt[[/]]lesar[[/]] is not.
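
The rule can be checked mechanically. The following is a minimal sketch, not part of Apertium or Transfuse: check_wordblank_nesting is a hypothetical helper name, and it assumes that only the two tag shapes shown above occur, namely an opener of the form [[...]] and the closer [[/]].

```shell
#!/bin/sh
# check_wordblank_nesting: hypothetical helper, not shipped with Apertium.
# Reads text on stdin. Exits 0 if no opener appears inside an already
# open span, and 1 if a nested opener is found.
check_wordblank_nesting() {
    awk '
    {
        s = $0
        # Walk through every [[...]] tag on the line, left to right.
        while (match(s, /\[\[[^]]*\]\]/)) {
            tag = substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)
            if (tag == "[[/]]") {
                # Closer: leave the current span, if any.
                if (depth > 0) depth--
            } else if (++depth > 1) {
                # Opener seen while another span is still open.
                nested = 1
            }
        }
    }
    END { exit (nested ? 1 : 0) }
    '
}
```

Piping the first example above through check_wordblank_nesting gives exit status 0, and the second gives exit status 1. The depth counter carries over from line to line, so a span that stays open across a line break is still tracked.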


Debugging transfuse issues

Transfuse runs the full pipeline with NUL flushing, so if you want to catch the intermediate output between stages of a mode, you need a little helper. Save this script somewhere in your $PATH under the filename teez:

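#!/bin/sh
# teez: like tee for NUL-flushed streams. Appends stdin to the given file and passes it on to stdout.
# Accept and ignore a leading -z.
[ "$1" = -z ] && shift

[ -z "$1" ] && { echo "Expecting file as arg" >&2; exit 1; }

# Read NUL-separated records (needs an awk that accepts NUL as RS, e.g. gawk).
# The NUL separators themselves are not written to the log or to stdout.
awk -v out="$1" 'BEGIN{RS="\0"}{printf "%s", $0 >> out; printf "%s", $0}'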

Then copy the mode with cp modes/foo-bar.mode modes/tf.mode and edit modes/tf.mode: insert teez /tmp/tf1.log | at the start, and | teez /tmp/tfN.log (with increasing N) between the pipeline steps you want to observe. Now you can run, for example, APERTIUM_TRANSFUSE=yes apertium -f docx -u -d . tf /tmp/in.docx >/tmp/out.docx and then inspect /tmp/tf?.log.
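
The tap layout can be tried out without an Apertium installation. This is only a sketch under stated assumptions: tap is a stand-in for teez built on plain tee, and tr and sed stand in for real pipeline stages. None of these commands come from an actual mode file, and the logs go to a temporary directory rather than /tmp/tfN.log.

```shell
#!/bin/sh
# Demo only: tap() stands in for teez, and tr/sed stand in for real
# pipeline stages. None of this is taken from an actual mode file.
tap() { tee -a "$1"; }

logdir=$(mktemp -d)

# Same layout as the edited mode: one tap first, one between the stages.
result=$(printf 'nett lesar\n' \
    | tap "$logdir/tf1.log" \
    | tr '[:lower:]' '[:upper:]' \
    | tap "$logdir/tf2.log" \
    | sed 's/NETT/WEB/')

# Each log holds the stream as it looked at that point in the pipeline.
for f in "$logdir"/tf?.log; do
    printf '== %s ==\n' "$f"
    cat "$f"
done
printf 'final: %s\n' "$result"
```

Comparing consecutive logs shows which stage changed the stream, which is the same way the /tmp/tf?.log files are meant to be read.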