Perceptron tagger

From Apertium
Revision as of 12:18, 22 August 2016 by Frankier

Step by step

The procedure mostly follows Supervised tagger training, except that you need an MTX file (and, optionally, a TSX file) instead of only a TSX file.

1a. Get an MTX file. Copy an MTX file into your language directory and optionally modify it, or start from scratch. See MTX format.

1b. Get a tagged corpus.

2. Train the tagger like so:

    apertium-tagger [--skip-on-error] -xs [ITERATIONS] TAGGED_CORPUS UNTAGGED_CORPUS MTX_FILE

You can put this in a Makefile. Use --skip-on-error to discard sentences for which TAGGED_CORPUS and UNTAGGED_CORPUS do not match. 10 is a good value for ITERATIONS.
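The training command in step 2 can be wrapped in a small script, which is also easy to call from a Makefile rule. The following is a minimal sketch, not a canonical recipe: every file name in it (xyz.tagged, xyz.untagged, apertium-xyz.xyz.mtx) is a hypothetical placeholder, and the argument order is copied from the usage line in step 2. By default it only prints the command; setting RUN=1 is assumed to be how you opt in to actually running it.

```shell
#!/bin/sh
# Sketch only: all file names below are hypothetical placeholders.
ITERATIONS=10                   # the value suggested in step 2
TAGGED_CORPUS=xyz.tagged        # hand-tagged corpus (hypothetical name)
UNTAGGED_CORPUS=xyz.untagged    # untagged counterpart (hypothetical name)
MTX_FILE=apertium-xyz.xyz.mtx   # MTX file from step 1a (hypothetical name)

# Argument order follows the usage line in step 2.
train_cmd="apertium-tagger --skip-on-error -xs $ITERATIONS $TAGGED_CORPUS $UNTAGGED_CORPUS $MTX_FILE"

# Dry run by default: print the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = 1 ]; then
    $train_cmd
else
    echo "$train_cmd"
fi
```

Keeping the file names in variables means that a different corpus or iteration count only requires changing one line.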

3. Run the tagger like so:

    apertium-tagger --tagger --perceptron model

You can put this in your modes.xml.
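Before wiring the tagging command from step 3 into modes.xml, it may help to try it by hand. The sketch below assumes that the model file is called model, as in step 3, and that the tagger reads analysed text on standard input and writes tagged text to standard output, as it does in its other tagging modes. The RUN switch is a hypothetical convenience of this script, not a feature of apertium-tagger.

```shell
#!/bin/sh
# Sketch only: MODEL is a placeholder for the file produced by training.
MODEL=model

# Command line as given in step 3.
tag_cmd="apertium-tagger --tagger --perceptron $MODEL"

# Dry run by default: print the command. Set RUN=1 to execute it,
# in which case input is assumed to arrive on standard input.
if [ "${RUN:-0}" = 1 ]; then
    $tag_cmd
else
    echo "$tag_cmd"
fi
```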