Perceptron tagger
Revision as of 12:18, 22 August 2016 by Frankier
Step by step
The process is mostly the same as in Supervised tagger training, except that you need an MTX file (and optionally a TSX file) instead of a TSX file.
1a. Get an MTX file. Copy an MTX file into your language directory and optionally modify it, or start from scratch. See MTX format.
1b. Get a tagged corpus.
2. Train the tagger like so:
apertium-tagger [--skip-on-error] -xs [ITERATIONS] TAGGED_CORPUS UNTAGGED_CORPUS MTX_FILE
You can put this in a Makefile. Use --skip-on-error to discard sentences for which the tagged and untagged corpora do not match. 10 is a good value for ITERATIONS.
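As a rough illustration of step 2, the sketch below assembles the training command from its parts and prints it rather than running it, so the arguments can be checked first. The function name perceptron_train_cmd and all file names are hypothetical placeholders. The argument order simply follows the template in step 2, so it is worth comparing against the usage message of your installed apertium-tagger before relying on it.

```shell
#!/bin/sh
# Sketch only: builds the training command from step 2 as a string.
# perceptron_train_cmd is a hypothetical helper, not part of Apertium.

# Usage: perceptron_train_cmd ITERATIONS TAGGED_CORPUS UNTAGGED_CORPUS MTX_FILE
# Prints the command line instead of executing it (a dry run).
perceptron_train_cmd() {
    iterations=$1
    tagged=$2
    untagged=$3
    mtx=$4
    # Argument order copied from the template in step 2 (assumed correct).
    printf 'apertium-tagger --skip-on-error -xs %s %s %s %s\n' \
        "$iterations" "$tagged" "$untagged" "$mtx"
}

# Example with placeholder file names and 10 iterations,
# the value suggested in step 2.
perceptron_train_cmd 10 xyz.tagged.txt xyz.untagged.txt apertium-xyz.xyz.mtx
```

Once the printed command looks right, it can be pasted into a Makefile rule or run directly.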
3. Run the tagger like so:
apertium-tagger --tagger --perceptron model
You can put this in your modes.xml.
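To show where the command from step 3 might sit in a pipeline, the sketch below prints a possible invocation that feeds morphological analyser output into the tagger. It makes several assumptions: that the tagger reads its input from standard input, that lt-proc is the analyser in use, and that xyz.automorf.bin and xyz.prob are the analyser and model files. Those file names and the helper perceptron_tag_cmd are hypothetical.

```shell
#!/bin/sh
# Sketch only: prints a possible tagging pipeline instead of running it.
# perceptron_tag_cmd is a hypothetical helper, not part of Apertium.

# Usage: perceptron_tag_cmd MODEL
# Prints the tagger invocation from step 3 for the given model file.
perceptron_tag_cmd() {
    printf 'apertium-tagger --tagger --perceptron %s\n' "$1"
}

# Assumed pipeline: analyse with lt-proc, then disambiguate with the tagger.
# xyz.automorf.bin and xyz.prob are placeholder file names.
printf 'lt-proc xyz.automorf.bin | %s\n' "$(perceptron_tag_cmd xyz.prob)"
```

The same tagger invocation is what would go into the tagger step of modes.xml, adapted to that file's own syntax.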