CG hybrid tagging

Tagging

The tagger is more robust against missing ambiguity sets. If it encounters a new ambiguity set, it falls back to a known one, choosing a) the smallest and b) the most frequent (in that order). This fallback to the "nearest" ambiguity set is used in other places too.
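The fallback rule can be sketched as follows. This is only an illustration and is not taken from the tagger's source: the function name `nearest_ambiguity_set`, the frequency dictionary, and in particular the assumption that the candidates are the known ambiguity sets containing every tag of the unseen set are all hypothetical.

```python
from typing import Dict, FrozenSet, Optional

AmbiguitySet = FrozenSet[str]


def nearest_ambiguity_set(
    unseen: AmbiguitySet,
    known_freq: Dict[AmbiguitySet, int],
) -> Optional[AmbiguitySet]:
    """Pick a fallback for an ambiguity set not seen in training.

    Assumption (not stated in the text): a known set is a candidate
    only if it contains every tag of the unseen set.
    """
    candidates = [s for s in known_freq if unseen <= s]
    if not candidates:
        return None
    # a) smallest set first, b) then most frequent; the sorted tag list
    # is only a deterministic tie-breaker for otherwise equal candidates.
    return min(candidates, key=lambda s: (len(s), -known_freq[s], sorted(s)))


# Hypothetical ambiguity sets and training frequencies.
known = {
    frozenset({"n", "vblex"}): 10,
    frozenset({"n", "adj"}): 7,
    frozenset({"n", "vblex", "adj"}): 50,
    frozenset({"n", "adj", "adv"}): 60,
}

fallback = nearest_ambiguity_set(frozenset({"n"}), known)
```

Here both two-tag sets are the smallest candidates for the unseen set containing only `n`, so frequency decides between them.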

Apart from feeding in ambiguity sets as is

Tagger training

Both supervised and unsupervised training are supported. The modes differ in where each part of the model comes from:

Model part \ Mode	0	1	2	3	4
Ambiguity classes	Dictionary	CG tagged	Dictionary	Dictionary	CG tagged + trimming
Ambiguity class frequency	Untagged	CG tagged	Untagged	Untagged	CG tagged
Corpus	Untagged	CG tagged	CG tagged (nearest)	Mix	CG tagged (nearest)

Note that in the case of supervised training the untagged corpus is used in conjunction with the tagged corpus.

Results

Compare with Comparison of part-of-speech tagging systems.
