Difference between revisions of "Odt2xliff"

Latest revision as of 17:22, 10 January 2008

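Here is a rough-and-ready script that extracts the text from an ODT file and converts it to XLIFF, using Apertium and the Translate Toolkit. It also calls a helper script, txtformat.py, which is not shown here. To get a .po file instead, remove the final call to po2xliff (and the rm that deletes the intermediate .po file).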

#!/bin/bash
# Needs bash rather than plain sh: the script uses 'function' and [[ ]].

INPUT_FILE=""
OUTPUT_FILE=""

function translate_odt
{
  INPUT_TMPDIR=/tmp/$$odtdir

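  # Use the first UTF-8 locale installed on the system
  # (listed as "utf8" or "UTF-8" depending on the platform).
  export LC_CTYPE=$(locale -a|grep -iE "utf-?8"|head -1);

  if [[ $LC_CTYPE == "" ]]
  then echo "Error: Install a UTF-8 locale on your system";
       exit 1;
  fi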

  if [[ $(which zip) == "" ]]
  then echo "Error: Install the 'zip' command on your system";
       exit 1;
  fi
  
  if [[ $(which unzip) == "" ]]
  then echo "Error: Install the 'unzip' command on your system";
       exit 1;
  fi
  
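  # With no input file given, read the ODT document from standard input
  # into a temporary file and flag that file as temporary (BORRAFICHERO).
  if [[ $FICHERO == "" ]]
  then FICHERO=/tmp/$$odtorig
       cat > $FICHERO
       BORRAFICHERO="true"
  fi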
  
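  # Unpack the ODT archive, then send each content.xml to the Apertium ODT
  # deformatter as one line prefixed with a <file name="..."/> marker.
  unzip -q -o -d $INPUT_TMPDIR $FICHERO
  find $INPUT_TMPDIR | grep content\\.xml |\
  awk '{printf "<file name=\"" $0 "\"/>"; PART = $0; while(getline < PART) printf(" %s", $0); printf("\n");}' |\
  apertium-desodt

  # Remove the unpacked archive and, if the input came from standard
  # input, the temporary copy of it.
  rm -rf $INPUT_TMPDIR
  if [[ $BORRAFICHERO == "true" ]]
  then rm -f $FICHERO
  fi
}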

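INPUT_FILE=$1
FICHERO=$INPUT_FILE;
TEMPO=/tmp/tem.po

# Extract the text, break it at each closing </text:p> tag, pass it
# through txtformat.py and convert the result to a temporary PO file.
translate_odt |
sed 's/<\\\/text:p>/]\n\n\n\n[/g' |
python txtformat.py |
txt2po - > $TEMPO

# Convert the temporary PO file to XLIFF.
po2xliff $TEMPO

rm $TEMPO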
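
The sed step is the least obvious part of the pipeline, so here is a minimal sketch of what it does in isolation. It assumes GNU sed (which treats \n in the replacement as a newline) and assumes that the deformatter output wraps markup in [...] blocks with the slash of the closing tag escaped by a backslash. The function name split_paragraphs and the sample string are hypothetical and only serve the illustration.

```shell
#!/bin/bash
# Sketch of the paragraph-splitting step, assuming GNU sed.

# split_paragraphs is a hypothetical helper name; it wraps the same sed
# expression that the script above uses.
split_paragraphs() {
  # Replace every escaped closing tag <\/text:p> with "]", four
  # newlines and "[", so each paragraph ends up separated by blank lines.
  sed 's/<\\\/text:p>/]\n\n\n\n[/g'
}

# Assumed sample of deformatter output (not taken from a real run).
sample='[<text:p>]Hello world[<\/text:p>][<text:p>]Second paragraph[<\/text:p>]'

printf '%s\n' "$sample" | split_paragraphs
```

Each closing paragraph tag is replaced by a closing bracket, four newlines and an opening bracket, which leaves the paragraphs separated by blank lines for the later txt2po step to treat as separate units.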