Difference between revisions of "Part-of-speech tagging"
Revision as of 06:58, 16 September 2008
Part-of-speech tagging is the process of assigning unambiguous grammatical categories[1] to words in context. The crux of the problem is that morphological analysis can often assign more than one part of speech to the surface form of a word. For example, in English, the word "trap" can be either a singular noun ("a trap") or a verb ("I'll trap it").
This page gives an overview of how part-of-speech tagging works in Apertium, primarily within apertium-tagger, together with a short overview of constraints (as in constraint grammar) and restrictions (as in apertium-tagger).
Lexical ambiguity
After morphological analysis of a sentence, a significant number of words will have more than one analysis. For example, in the following Spanish sentence, "vino" can be analysed either as a noun ("wine") or as a verb ("he/she came"):
- Vino (noun or verb) a la playa
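As a rough illustration, the result of morphological analysis for the sentence above can be pictured as a mapping from each surface form to its candidate tags. The lexicon below is a hand-written toy: the tag names and the analyses listed for "a", "la" and "playa" are assumptions made for this sketch, not output from an Apertium analyser.

```python
# Toy lexicon: surface form -> set of candidate part-of-speech tags.
# Hypothetical data for illustration only, not real analyser output.
TOY_LEXICON = {
    "vino": {"noun", "verb"},
    "a": {"preposition"},
    "la": {"determiner", "pronoun"},
    "playa": {"noun"},
}


def analyse(sentence):
    """Return (word, candidate tags) pairs; unknown words get an empty set."""
    return [(w, TOY_LEXICON.get(w.lower(), set())) for w in sentence.split()]


def ambiguous_words(analysed):
    """Keep only the words that received more than one candidate tag."""
    return [w for w, tags in analysed if len(tags) > 1]


analysed = analyse("Vino a la playa")
for word, tags in analysed:
    print(word, sorted(tags))
print("ambiguous:", ambiguous_words(analysed))
```

The tagger's job is then to pick exactly one tag per word from these candidate sets.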
Hidden Markov models
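A hidden Markov model (HMM) is a statistical model consisting of a set of hidden states and a set of observable states. The hidden states cannot be seen directly and have to be inferred from the sequence of observations; in part-of-speech tagging, the hidden states correspond to the tags and the observations to the words of the text being tagged.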
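A minimal sketch of the pieces such a model is made of, assuming the usual setup for tagging in which the hidden states are tags and the observable states are words. The class name `ToyHMM` and every probability below are invented for illustration; apertium-tagger's own data structures are not shown here.

```python
from dataclasses import dataclass


@dataclass
class ToyHMM:
    states: list        # hidden states (here: tags)
    initial: dict       # P(first state)
    transition: dict    # P(next state | current state)
    emission: dict      # P(observation | state)

    def joint_probability(self, tags, words):
        """P(tags, words) for one fully specified tag sequence."""
        prob = self.initial[tags[0]] * self.emission[tags[0]].get(words[0], 0.0)
        for prev, cur, word in zip(tags, tags[1:], words[1:]):
            prob *= self.transition[prev][cur] * self.emission[cur].get(word, 0.0)
        return prob


# Made-up numbers, chosen only so that each distribution sums to 1.
hmm = ToyHMM(
    states=["noun", "verb"],
    initial={"noun": 0.6, "verb": 0.4},
    transition={
        "noun": {"noun": 0.3, "verb": 0.7},
        "verb": {"noun": 0.8, "verb": 0.2},
    },
    emission={
        "noun": {"trap": 0.5, "wine": 0.5},
        "verb": {"trap": 0.9, "came": 0.1},
    },
)

print(hmm.joint_probability(["noun", "verb"], ["wine", "trap"]))
print(hmm.joint_probability(["verb", "noun"], ["wine", "trap"]))
```

The second call returns zero because, in this toy model, the state "verb" never emits "wine". A tagger compares such probabilities across all candidate tag sequences rather than scoring one sequence at a time.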
Ambiguity classes
Training
Expectation-Maximisation (EM)
Baum-Welch
Tagging
Viterbi
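A minimal sketch of Viterbi decoding over a toy model, assuming the hidden states are tags and the observations are words. The function name `viterbi`, the tag set and all probabilities are made up for illustration and say nothing about how apertium-tagger implements this step.

```python
def viterbi(observations, states, initial, transition, emission):
    """Return the most probable state sequence and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               state that precedes s on that path)
    first = observations[0]
    best = [{s: (initial[s] * emission[s].get(first, 0.0), None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * transition[p][s], p) for p in states
            )
            column[s] = (prob * emission[s].get(obs, 0.0), prev)
        best.append(column)

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Toy model with invented probabilities.
states = ["det", "noun", "verb"]
initial = {"det": 0.6, "noun": 0.2, "verb": 0.2}
transition = {
    "det": {"det": 0.05, "noun": 0.9, "verb": 0.05},
    "noun": {"det": 0.2, "noun": 0.2, "verb": 0.6},
    "verb": {"det": 0.5, "noun": 0.3, "verb": 0.2},
}
emission = {
    "det": {"a": 1.0},
    "noun": {"trap": 0.4, "wine": 0.6},
    "verb": {"trap": 0.7, "came": 0.3},
}

path, prob = viterbi(["a", "trap"], states, initial, transition, emission)
print(path, prob)
```

In this toy model "trap" is more likely as a verb when taken in isolation, but the preceding determiner makes the noun reading win, which is the kind of contextual disambiguation the tagger is meant to perform.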
See also
- Tagger training
- Constraint grammar
Notes
- [1] Also referred to as "parts of speech", e.g. noun, verb, adjective, adverb, conjunction, etc.