MTX format

From Apertium
Revision as of 14:24, 22 August 2016 by Frankier

This page serves as a reference to the MTX format. The MTX format describes the features used by the Perceptron tagger.

Example

Here is an example of the basic outline of an MTX file to illustrate the structure and some common constructs:

<?xml version="1.0" ?>①
<!DOCTYPE metatag [
  <!ENTITY commondefns SYSTEM "commondefns.mtx">②
]>
<!-- Comment -->③
<metatag>
  <coarse-tags tag="mytsx.tsx" />④
  <beam-width val="10" />⑤
  <defns>⑥
    &commondefns;②
    <def-str name="plus" val="+" />
    <def-macro name="foo">
      ...
    </def-macro>
    ...
  </defns>
  <feats>⑦
    <!-- Major tag (all wordoids) -->
    <feat>⑧
      ...
      <pred>...</pred>
      <out>
        <macro name="foo"></macro>
        ...
      </out>
      <out-many>...</out-many>
    </feat>
  </feats>
</metatag>

  1. MTX is an XML format.
  2. Because it is XML, other files can be included using XML entities, as illustrated.
  3. XML comments can also be used.
  4. To make use of coarse tags, reference a TSX file using a relative file path.
  5. This tag changes the beam width used in decoding.
  6. The defns section contains constants and macros.
  7. The feats section contains feature definitions.
  8. Each feature definition can contain many boolean predicates with <pred>, normal output with <out>, and generation of many features from an array type with <out-many>.
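
Since MTX is plain XML, the outline above can be inspected with any XML library. The following is a minimal sketch in Python, not how the tagger itself loads the file. It assumes a stripped-down copy of the example in which the entity include and the elided "..." bodies are left out so that the string parses on its own. The helper name summarize_mtx and the empty placeholder bodies are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Stripped-down outline using only tags from the example above.
# Assumption: the entity include and the elided "..." bodies are omitted,
# and <pred>, <out-many> and <def-macro> are left as empty placeholders.
SAMPLE_MTX = """<?xml version="1.0" ?>
<!-- Comment -->
<metatag>
  <coarse-tags tag="mytsx.tsx" />
  <beam-width val="10" />
  <defns>
    <def-str name="plus" val="+" />
    <def-macro name="foo"></def-macro>
  </defns>
  <feats>
    <feat>
      <pred></pred>
      <out><macro name="foo"></macro></out>
      <out-many></out-many>
    </feat>
  </feats>
</metatag>
"""


def summarize_mtx(text):
    """Hypothetical helper: describe the top-level outline of an MTX document."""
    root = ET.fromstring(text)
    coarse = root.find("coarse-tags")
    beam = root.find("beam-width")
    defns = root.find("defns")
    feats = root.find("feats")
    return {
        "root": root.tag,
        # Relative path to the TSX file, if coarse tags are used.
        "coarse_tags": coarse.get("tag") if coarse is not None else None,
        # Beam width used in decoding, if given.
        "beam_width": int(beam.get("val")) if beam is not None else None,
        # (tag, name) pairs for each constant or macro in <defns>.
        "definitions": [(d.tag, d.get("name")) for d in defns]
        if defns is not None
        else [],
        # For each <feat>, the tags of its direct children in order.
        "features": [[child.tag for child in feat] for feat in feats.findall("feat")]
        if feats is not None
        else [],
    }


if __name__ == "__main__":
    print(summarize_mtx(SAMPLE_MTX))
```

The sketch only reads structure; it says nothing about how predicates or outputs are evaluated. XML comments are dropped by the default parser, so they do not appear in the summary.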

Tag reference