Part-of-speech tagging

From Apertium
Revision as of 13:34, 3 September 2008 by Francis Tyers (talk | contribs)

Part-of-speech tagging is the process of assigning unambiguous grammatical categories[1] to words in context. The crux of the problem is that morphological analysis can often assign more than one part-of-speech to the surface form of a word. For example, in English, the word "trap" can be either a singular noun ("a trap") or a verb ("I'll trap it").
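
The ambiguity described above can be made concrete with a small sketch. The lexicon, the tag names and the helper functions here are hypothetical and for illustration only; they do not reflect Apertium's dictionary format or tag set.

```python
# Toy lexicon mapping surface forms to their possible parts-of-speech.
# Entries and tag names are invented for illustration (an assumption),
# not taken from any Apertium dictionary.
LEXICON = {
    "a": {"det"},
    "trap": {"noun", "verb"},
    "i'll": {"pron+modal"},
    "it": {"pron"},
}


def analyse(sentence):
    """Return (word, sorted candidate tags) pairs for each word."""
    return [
        (word, sorted(LEXICON.get(word.lower(), {"unknown"})))
        for word in sentence.split()
    ]


def ambiguous_words(sentence):
    """Return the words that received more than one candidate tag."""
    return [word for word, tags in analyse(sentence) if len(tags) > 1]


if __name__ == "__main__":
    for sentence in ("a trap", "I'll trap it"):
        print(sentence, "->", analyse(sentence))
```

In both sentences only "trap" comes back with two candidates; choosing between them from the surrounding words is the tagger's job.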

This page gives an overview of how part-of-speech tagging works in Apertium. It focuses on the apertium-tagger, with a short look at constraints (as in constraint grammar) and restrictions (as in apertium-tagger).

Hidden Markov models

A hidden Markov model is a statistical model of .......

Ambiguity classes

Training

Expectation-Maximisation (EM)

Baum-Welch

Tagging

Viterbi
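
The Viterbi algorithm finds the most probable sequence of hidden states (here, tags) for a sequence of observations (here, words) under a hidden Markov model. The following is a minimal sketch for a first-order model, applied to the "trap" examples from the introduction. All probabilities, tag names and function names are invented for illustration, and the code is not the apertium-tagger implementation.

```python
# Toy model parameters. Every number here is made up (an assumption).
START_P = {"det": 0.6, "modal": 0.4}
TRANS_P = {
    "det": {"noun": 0.9, "verb": 0.1},
    "modal": {"noun": 0.1, "verb": 0.9},
    "noun": {"pron": 0.2},
    "verb": {"pron": 0.8},
}
EMIT_P = {
    "det": {"a": 1.0},
    "modal": {"I'll": 1.0},
    "noun": {"trap": 0.5},
    "verb": {"trap": 0.5},
    "pron": {"it": 1.0},
}


def viterbi(candidates, start_p, trans_p, emit_p):
    """Return the most probable tag sequence.

    candidates is a list of (word, [possible tags]) pairs, i.e. the
    output of morphological analysis for one sentence.
    """
    word, tags = candidates[0]
    # best[tag] = (probability of the best path ending in tag, that path)
    best = {t: (start_p[t] * emit_p[t][word], [t]) for t in tags}
    for word, tags in candidates[1:]:
        new_best = {}
        for t in tags:
            # Pick the best predecessor for tag t at this position.
            prob, path = max(
                (
                    (p * trans_p[prev][t] * emit_p[t][word], prev_path)
                    for prev, (p, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[t] = (prob, path + [t])
        best = new_best
    # The best complete path is the one with the highest final probability.
    return max(best.values(), key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(viterbi([("a", ["det"]), ("trap", ["noun", "verb"])],
                  START_P, TRANS_P, EMIT_P))
    print(viterbi([("I'll", ["modal"]), ("trap", ["noun", "verb"]),
                   ("it", ["pron"])], START_P, TRANS_P, EMIT_P))
```

With these toy numbers, "trap" is tagged as a noun after the determiner and as a verb after the modal. A practical implementation would work with log probabilities to avoid underflow on long sentences.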

Notes

  1. Also referred to as "parts-of-speech", e.g. Noun, Verb, Adjective, Adverb, Conjunction, etc.