
ACX format


Revision as of 14:41, 18 April 2008

The ACX format is used to declare equivalent characters for monolingual dictionaries (monodices). If a language has more than one way of writing a character, as with Romanian ș (s with comma below) and ş (s with cedilla), the file can define them as equivalent. It is also useful in languages where the apostrophe is grammatically important (e.g. Catalan), to make sure that the different apostrophe variants are all accepted.

The format is defined in the file acx.rng, which can be found in both the lttoolbox and apertium modules in SVN.

Example file

The file apertium-es-ro.ro.acx, taken from apertium-es-ro:

<?xml version="1.0"?>
<analysis-chars>
  <!-- Make apostrophe variants equal ' -->
  <char value="'">
    <equiv-char value="’"/>
    <equiv-char value="ʼ"/>
  </char>

  <!-- Legacy values for characters with comma -->
  <char value="ț">
    <equiv-char value="ţ"/>
  </char>
  <char value="Ț">
    <equiv-char value="Ţ"/>
  </char>
  <char value="ș">
    <equiv-char value="ş"/>
  </char>
  <char value="Ș">
    <equiv-char value="Ş"/>
  </char>

  <!-- Orthographic variant -->
  <char value="â">
    <equiv-char value="î"/>
  </char>
</analysis-chars>
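
To make the structure of the file concrete, here is a minimal Python sketch that reads an ACX document and builds a lookup table from each equiv-char to the char it belongs to. It only illustrates the data model; it does not reproduce how lttoolbox itself consumes the file. The names ACX_SAMPLE, load_equivalents and normalise are hypothetical, and the sample is a shortened copy of the file above with the non-ASCII characters written as escapes.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example file (hypothetical sample for this sketch).
# \u2019 and \u02bc are apostrophe variants; \u0219 is s with comma below,
# \u015f is the legacy s with cedilla.
ACX_SAMPLE = """<?xml version="1.0"?>
<analysis-chars>
  <char value="'">
    <equiv-char value="\u2019"/>
    <equiv-char value="\u02bc"/>
  </char>
  <char value="\u0219">
    <equiv-char value="\u015f"/>
  </char>
</analysis-chars>
"""


def load_equivalents(acx_text):
    """Return a dict mapping every equiv-char value to its parent char value."""
    root = ET.fromstring(acx_text)
    mapping = {}
    for char in root.iter("char"):
        canonical = char.get("value")
        for equiv in char.iter("equiv-char"):
            mapping[equiv.get("value")] = canonical
    return mapping


def normalise(text, mapping):
    """Replace each equivalent character in text by its parent character."""
    return "".join(mapping.get(ch, ch) for ch in text)


if __name__ == "__main__":
    table = load_equivalents(ACX_SAMPLE)
    # Romanian "si" written with the legacy cedilla form becomes the comma form.
    print(normalise("\u015fi", table) == "\u0219i")
```

Rewriting the input text like this is a simplification chosen for the sketch. For an orthographic variant such as â and î in the full file, blind replacement would change words that are already spelled correctly, so treat the table as a description of which characters are declared equivalent, not as a ready-made normaliser.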