Difference between revisions of "Transfuse"

From Apertium
Revision as of 11:21, 3 June 2022

Transfuse is used under the hood for Format handling on the edges of the pipeline.

See

* https://github.com/TinoDidriksen/Transfuse

Notes on wordblank format

The pipe may not yield nested structures, nor will Transfuse give it nested structures, so [[t:text:SyTAKg]]nett[[/]][[t:text:SyTAKg]]lesar[[/]] is OK, while [[t:text:SyTAKg]]ne[[t:text:SyTAKg]]tt[[/]]lesar[[/]] is not.
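The no-nesting rule above can be checked mechanically on a captured stream. The following is a minimal sketch, not part of Transfuse: the function name check_nesting is made up for illustration, and it assumes that every opener has the form [[t:...]] and every closer is exactly [[/]], as in the two examples above. Other wordblank forms or escaping rules are not handled.

```shell
#!/bin/sh
# check_nesting (hypothetical helper): read text on stdin and print "nested"
# if a [[t:...]] opener appears while an earlier one is still open,
# otherwise print "ok". Assumes openers look like [[t:...]] and the
# closer is exactly [[/]].
check_nesting() {
    awk '
    {
        s = $0
        # Walk through the line one [[...]] marker at a time.
        while (match(s, /\[\[[^]]*\]\]/)) {
            tag = substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)
            if (tag == "[[/]]") {
                depth = 0                  # closer: nothing is open any more
            } else if (tag ~ /^\[\[t:/) {
                if (depth > 0) nested = 1  # opener inside an open wordblank
                depth = 1
            }
        }
    }
    END { print (nested ? "nested" : "ok") }
    '
}
```

For example, `check_nesting < /tmp/tf3.log` could be run on one of the intermediate logs produced with the debugging helper described further down.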


Debugging transfuse issues

Transfuse runs the full pipeline with NUL flushing, so to capture the intermediate output between the stages of a mode you need a small helper. Save the following script as an executable file named teez somewhere in your $PATH:

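#!/bin/sh
# teez: tee-like helper for NUL-flushed pipelines.
# Skip a leading -z if one is given.
[ "$1" = -z ] && shift

# The log file to append to is required.
[ -z "$1" ] && { echo "Expecting file as arg" >&2; exit 1; }

# Read NUL-delimited records; append each to the log file and echo it to stdout.
awk -v out="$1" 'BEGIN{RS="\0"}{printf "%s", $0 >> out; printf "%s", $0}'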

Then copy modes/foo-bar.mode to modes/tf.mode. In modes/tf.mode, insert `teez /tmp/tf1.log | ` at the start of the pipeline and `| teez /tmp/tfN.log` (with N = 2, 3, and so on) between the pipeline steps. Run the instrumented mode, for example with APERTIUM_TRANSFUSE=yes apertium -f docx -u -d . tf /tmp/in.docx >/tmp/out.docx, and then inspect /tmp/tf?.log.
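Editing the mode file by hand gets tedious for long pipelines. As a rough sketch of how the insertion could be automated: the function name instrument_mode is hypothetical, and it assumes the mode file holds the pipeline on a single line with stages separated by " | " and no literal " | " inside quoted arguments. Check the result by eye before using it.

```shell
#!/bin/sh
# instrument_mode (hypothetical helper): read a mode pipeline on stdin and
# print it with a teez call before the first stage and between stages.
# Assumes stages are separated by " | " and that this sequence never
# occurs inside a quoted argument.
instrument_mode() {
    awk '
    NF == 0 { print; next }        # leave blank lines untouched
    {
        n = split($0, stage, / [|] /)
        line = "teez /tmp/tf1.log | " stage[1]
        for (i = 2; i <= n; i++) {
            line = line " | teez /tmp/tf" i ".log | " stage[i]
        }
        print line
    }'
}
```

Usage would be along the lines of `instrument_mode < modes/foo-bar.mode > modes/tf.mode`. Note that the /tmp/tf?.log glob only matches single-digit log numbers, so a pipeline with more than nine logs needs a wider pattern such as /tmp/tf*.log.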