Difference between revisions of "Part-of-speech tagging"

Revision as of 08:28, 16 September 2008

Part-of-speech tagging is the process of assigning unambiguous grammatical categories^[1] to words in context. The crux of the problem is that morphological analysis can often assign more than one part of speech to the surface form of a word. For example, in English the word "trap" can be either a singular noun ("a trap") or a verb ("I'll trap it").

This page gives an overview of how part-of-speech tagging works in Apertium, primarily within apertium-tagger, along with a short overview of constraints (as in constraint grammar) and restrictions (as in apertium-tagger).

Lexical ambiguity

After morphological analysis of a sentence, a considerable number of words will have more than one analysis. For example, in the following Spanish sentence:

Vino (noun or verb) a (preposition) la (determiner or pronoun) playa (noun)
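
The ambiguity in this sentence can be made concrete with a short sketch. The dictionary below is written by hand for illustration and is not the output format of Apertium's morphological analyser; the tag names simply mirror the labels used above.

```python
from itertools import product

# Hand-written analyses for the example sentence (illustrative only;
# not produced by an actual morphological analyser).
analyses = {
    "Vino": ["noun", "verb"],
    "a": ["preposition"],
    "la": ["determiner", "pronoun"],
    "playa": ["noun"],
}

sentence = ["Vino", "a", "la", "playa"]

# Words with more than one analysis are the ambiguous ones.
ambiguous = [w for w in sentence if len(analyses[w]) > 1]

# Every combination of one tag per word is a candidate tag sequence.
candidates = list(product(*(analyses[w] for w in sentence)))

print(ambiguous)        # ['Vino', 'la']
print(len(candidates))  # 4
```

Two ambiguous words with two analyses each already give four candidate tag sequences; the tagger's job is to pick one of them.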

Hidden Markov models

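A hidden Markov model (HMM) is a statistical model consisting of a set of hidden states, which cannot be observed directly, and a set of observable outputs, each emitted by a hidden state with some probability. In part-of-speech tagging, the hidden states correspond to the parts of speech and the observations are derived from the words of the text.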
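
As a rough illustration of this structure, the sketch below scores tag sequences for the sentence "Vino a la playa" with a toy first-order HMM. Every probability is invented for the example, the names `start`, `trans`, `emit` and `joint` are hypothetical, and this is not how apertium-tagger is implemented; it only shows how hidden states (tags) and observations (here, the words themselves) combine into a sequence probability.

```python
from itertools import product

# Hidden states: the tags. Observations: the surface words.
# All probabilities are invented for illustration; pairs that are
# not listed are treated as having probability zero.
states = ["noun", "verb", "preposition", "determiner", "pronoun"]

start = {"noun": 0.3, "verb": 0.4, "preposition": 0.1,
         "determiner": 0.1, "pronoun": 0.1}

trans = {  # P(current tag | previous tag)
    ("verb", "preposition"): 0.4,
    ("noun", "preposition"): 0.3,
    ("preposition", "determiner"): 0.6,
    ("preposition", "pronoun"): 0.1,
    ("determiner", "noun"): 0.8,
    ("pronoun", "noun"): 0.05,
}

emit = {  # P(word | tag)
    ("noun", "Vino"): 0.01,
    ("verb", "Vino"): 0.02,
    ("preposition", "a"): 0.5,
    ("determiner", "la"): 0.3,
    ("pronoun", "la"): 0.2,
    ("noun", "playa"): 0.01,
}

def joint(tags, words):
    """Joint probability of a tag sequence and a word sequence."""
    p = start[tags[0]] * emit.get((tags[0], words[0]), 0.0)
    for prev, cur, word in zip(tags, tags[1:], words[1:]):
        p *= trans.get((prev, cur), 0.0) * emit.get((cur, word), 0.0)
    return p

sentence = ["Vino", "a", "la", "playa"]

# Exhaustive search over all tag sequences, used only to keep the
# sketch short; its cost grows exponentially with sentence length.
best = max(product(states, repeat=len(sentence)),
           key=lambda tags: joint(tags, sentence))

print(best)  # ('verb', 'preposition', 'determiner', 'noun')
```

With these made-up numbers the verb reading of "Vino" and the determiner reading of "la" win, because the transitions verb to preposition and determiner to noun were given the highest probabilities.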

Ambiguity classes

Training

Expectation-Maximisation (EM)

Baum-Welch

Tagging

Viterbi

Notes

[1] Also referred to as "parts-of-speech", e.g. Noun, Verb, Adjective, Adverb, Conjunction, etc.

Older revision: 06:58, 16 September 2008, by Francis Tyers. Newer revision: 08:28, 16 September 2008, by Francis Tyers.

Line 9:
− :Vino (noun or verb) a ( la playa
+ :Vino (<code>noun</code> or <code>verb</code>) a (<code>preposition</code>) la (<code>determiner</code> or <code>pronoun</code>) playa (<code>noun</code>)
