Difference between revisions of "Transfuse"
'''Transfuse''' is used under the hood for Format handling on the edges of the pipeline.

See
* User:Khannatanmai/Wordbound_blanks
* https://github.com/TinoDidriksen/Transfuse

== Notes on wordblank format ==

The pipeline must not produce nested structures, and Transfuse will not give it nested structures either.

So <code><nowiki>[[t:text:SyTAKg]]nett[[/]][[t:text:SyTAKg]]lesar[[/]]</nowiki></code> is OK, while <code><nowiki>[[t:text:SyTAKg]]ne[[t:text:SyTAKg]]tt[[/]]lesar[[/]]</nowiki></code> is not.
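
The no-nesting rule can be checked mechanically. The following is only a rough sketch: <code>check_blanks</code> is a hypothetical helper (not part of Transfuse or Apertium), and it assumes that every <code><nowiki>[[...]]</nowiki></code> other than <code><nowiki>[[/]]</nowiki></code> opens a wordblank and that <code><nowiki>[[/]]</nowiki></code> closes the most recent one. It reads text on stdin and exits non-zero if an opener appears while another wordblank is still open.

```shell
#!/bin/sh
# check_blanks: hypothetical helper, not part of Transfuse or Apertium.
# Assumption: any [[...]] except [[/]] opens a wordblank, and [[/]] closes it.
# Exit status is 0 if no opener occurs inside an open wordblank, 1 otherwise.
check_blanks() {
    awk '
    {
        line = $0
        # Walk through every [[...]] marker on the line, left to right.
        while (match(line, /\[\[[^]]*\]\]/)) {
            tag = substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)
            if (tag == "[[/]]") {
                if (depth > 0) depth--
            } else if (++depth > 1) {
                bad = 1          # opener seen while another blank is open
            }
        }
    }
    END { exit bad + 0 }
    '
}
```

For example, <code><nowiki>printf '%s\n' '[[t:text:SyTAKg]]ne[[t:text:SyTAKg]]tt[[/]]lesar[[/]]' | check_blanks || echo nested</nowiki></code> should report the nested case, while the flat example above passes silently.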

== Debugging Transfuse issues ==

Transfuse runs the full pipeline with NUL flushing, so to capture the intermediate output between the stages of a mode you need a small helper. Save the following script under the filename <code>teez</code> in a directory on your $PATH:
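
<pre>
#!/bin/sh
# Drop a leading -z argument if present.
[ "$1" = -z ] && shift
# Require a log file name as the remaining argument.
[ -z "$1" ] && { echo "Expecting file as arg" >&2; exit 1; }
# Read NUL-separated records; append each to the log file and copy it to stdout.
awk -v out="$1" 'BEGIN{RS="\0"}{printf "%s", $0 >> out; printf "%s", $0}'
</pre>
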
Then copy the mode with <code>cp modes/foo-bar.mode modes/tf.mode</code> and edit <code>modes/tf.mode</code>: insert <code>teez /tmp/tf1.log | </code> at the start and <code>| teez /tmp/tfN.log</code> (and so on) between the pipeline steps. Run, for example, <code>APERTIUM_TRANSFUSE=yes apertium -f docx -u -d . tf /tmp/in.docx >/tmp/out.docx</code> and then inspect <code>/tmp/tf?.log</code>.
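
To look at all the captured logs in one go, something like the sketch below may help. <code>show_tf_logs</code> is a hypothetical helper, not part of Transfuse or Apertium; it assumes the logs follow the <code>tf?.log</code> naming used above and turns any NUL bytes that ended up in them into newlines so that they are visible in a terminal.

```shell
#!/bin/sh
# show_tf_logs: hypothetical helper, not part of Transfuse or Apertium.
# Prints each tf?.log in the given directory (default: /tmp) under a header,
# translating NUL bytes to newlines.
show_tf_logs() {
    dir=${1:-/tmp}
    for f in "$dir"/tf?.log; do
        [ -e "$f" ] || continue      # the glob matched nothing
        printf '== %s ==\n' "$f"
        tr '\000' '\n' < "$f"
        printf '\n'
    done
}
```

Calling <code>show_tf_logs</code> without arguments reads from /tmp; the first log whose content looks wrong points at the pipeline step just before it.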
Latest revision as of 11:04, 16 June 2022