Difference between revisions of "Tagger"

Latest revision as of 01:59, 24 January 2020

Tagger is usually short for part-of-speech tagger, a program that takes a sequence of morphologically analysed words, some of which have several possible analyses, and chooses the most probable analysis for each one.

Given the following ambiguous input (from the Spanish "tengo una idea"):

^tengo/tener<vblex><pri><p1><sg>$ ^una/uno<prn><tn><f><sg>/uno<det><ind><f><sg>/unir<vblex><prs><p3><sg>/unir<vblex><prs><p1><sg>/unir<vblex><imp><p3><sg>$ ^idea/idea<n><f><sg>/idear<vblex><pri><p3><sg>/idear<vblex><imp><p2><sg>$

a good tagger would end up with:

^tener<vblex><pri><p1><sg>$ ^uno<det><ind><f><sg>$ ^idea<n><f><sg>$
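
To make the input format concrete, here is a minimal Python sketch that splits such a stream into words and their candidate analyses. It is not part of Apertium: the name `parse_stream` is hypothetical, and the sketch assumes the simple `^surface/analysis/...$` shape shown above, ignoring escaped characters and any formatting blocks a real stream may contain.

```python
import re

# One lexical unit looks like ^surface/analysis1/analysis2/...$
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def parse_stream(text):
    """Return a list of (surface_form, [analyses]) pairs (hypothetical helper)."""
    units = []
    for match in LEXICAL_UNIT.finditer(text):
        # The first field is the surface form; the rest are candidate analyses.
        surface, *analyses = match.group(1).split("/")
        units.append((surface, analyses))
    return units


ambiguous = (
    "^tengo/tener<vblex><pri><p1><sg>$ "
    "^una/uno<prn><tn><f><sg>/uno<det><ind><f><sg>/unir<vblex><prs><p3><sg>"
    "/unir<vblex><prs><p1><sg>/unir<vblex><imp><p3><sg>$ "
    "^idea/idea<n><f><sg>/idear<vblex><pri><p3><sg>/idear<vblex><imp><p2><sg>$"
)

units = parse_stream(ambiguous)
for surface, analyses in units:
    print(surface, len(analyses))
# tengo 1
# una 5
# idea 3
```

The tagger's job is to reduce each of these lists to a single analysis.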

The program apertium-tagger achieves this with a Hidden Markov Model (HMM), a statistical model using bigrams (trigram training is also possible). Training of apertium-tagger can be supervised or unsupervised. There is also target-language tagger training, in which a target-language language model judges how good the translations produced by each tagging are. If a certain bigram sequence is impossible, one may tell the tagger so explicitly with FORBID or ENFORCE rules.
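
The bigram idea can be illustrated with a small Viterbi search in Python. This is a sketch under loud assumptions, not the apertium-tagger implementation: the transition probabilities are invented for the example, emission probabilities are left out entirely, the tags are reduced to coarse labels, and a forbidden bigram is simply treated as having probability zero. The names `TRANSITIONS` and `viterbi` are hypothetical.

```python
import math

# Invented toy values for P(tag | previous tag); not taken from any trained
# model. "BOS" marks the beginning of the sentence.
TRANSITIONS = {
    ("BOS", "vblex"): 0.4,
    ("vblex", "det"): 0.5,
    ("vblex", "prn"): 0.2,
    ("vblex", "vblex"): 0.1,
    ("vblex", "n"): 0.2,
    ("det", "n"): 0.9,
    ("det", "vblex"): 0.1,
    ("prn", "n"): 0.2,
    ("prn", "vblex"): 0.8,
}


def viterbi(candidates, transitions, forbidden=frozenset()):
    """Pick the most probable tag sequence under a bigram transition model.

    candidates: one list of possible tags per word.
    forbidden: (previous_tag, tag) bigrams that must never be chosen.
    """
    # best[tag] = (log-probability, path) of the best sequence ending in tag
    best = {"BOS": (0.0, [])}
    for tags in candidates:
        step = {}
        for tag in tags:
            for prev, (score, path) in best.items():
                if (prev, tag) in forbidden:
                    continue  # assumption: forbidden means probability zero
                p = transitions.get((prev, tag), 0.0)
                if p == 0.0:
                    continue
                new_score = score + math.log(p)
                if tag not in step or new_score > step[tag][0]:
                    step[tag] = (new_score, path + [tag])
        if not step:
            raise ValueError("no allowed tag sequence for this input")
        best = step
    return max(best.values(), key=lambda entry: entry[0])[1]


# Coarse candidate tags for "tengo una idea"
candidates = [["vblex"], ["prn", "det", "vblex"], ["n", "vblex"]]

print(viterbi(candidates, TRANSITIONS))
# ['vblex', 'det', 'n']

# Forbidding a bigram (done here only to show the mechanism) changes the result.
print(viterbi(candidates, TRANSITIONS, forbidden={("vblex", "det")}))
# ['vblex', 'prn', 'vblex']
```

With these toy numbers the search picks verb, determiner, noun, matching the disambiguated output above. The second call shows how excluding one bigram redirects the search to the next-best sequence.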

Some language pairs use Constraint Grammar (CG) to remove readings before apertium-tagger runs; CG lets you write rule-based taggers, which allow more complex rules.
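
The idea of removing readings by rule before statistical tagging can be sketched in Python as follows. This is not CG syntax and not how CG is implemented. The function `remove_readings` and the rule itself (drop verb readings after a word whose readings are all verbs) are hypothetical and chosen only for illustration. As in CG, the last remaining reading of a word is never removed.

```python
def remove_readings(units, tag, previous_all_have):
    """Drop readings containing `tag` when every reading of the previous
    word contains `previous_all_have`; never remove a word's last reading."""
    result = []
    for surface, readings in units:
        kept = readings
        if result and all(previous_all_have in r for r in result[-1][1]):
            filtered = [r for r in readings if tag not in r]
            if filtered:  # keep at least one reading
                kept = filtered
        result.append((surface, kept))
    return result


units = [
    ("tengo", ["tener<vblex><pri><p1><sg>"]),
    ("una", [
        "uno<prn><tn><f><sg>",
        "uno<det><ind><f><sg>",
        "unir<vblex><prs><p3><sg>",
        "unir<vblex><prs><p1><sg>",
        "unir<vblex><imp><p3><sg>",
    ]),
    ("idea", [
        "idea<n><f><sg>",
        "idear<vblex><pri><p3><sg>",
        "idear<vblex><imp><p2><sg>",
    ]),
]

# Illustrative rule: no verb reading directly after an unambiguous verb.
filtered = remove_readings(units, "<vblex>", "<vblex>")
for surface, readings in filtered:
    print(surface, len(readings))
# tengo 1
# una 2
# idea 3
```

After this pass the statistical tagger only has to choose between the pronoun and determiner readings of "una".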

Revision as of 10:44, 24 March 2012 by Bech (Category:Documentation in English); latest revision as of 01:59, 24 January 2020 by ScoopGracie. One intermediate revision by one other user is not shown.

