Comparison of part-of-speech tagging systems

System	Language
	Catalan	Spanish	Serbo-Croatian	Russian	Kazakh	Portuguese	Swedish	Italian
	23,673	20,487	20,071	1,052	13,714	6,725	369	5,201
1st	86.50	90.34	44.99±1.20	38.19	72.08	76.70	34.70	82.28±3.05
Bigram (unsup, 0 iters)	88.96±1.12	88.49±1.54	47.31±1.24			81.41±5.78		79.16±3.12
Bigram (unsup, 50 iters)	91.74±1.15	91.13±1.52	48.28±1.33			81.09±5.99		84.93±2.71
Bigram (unsup, 250 iters)	91.51±1.16	90.85±1.48	48.05±1.47			80.31±6.60		84.52±2.78
Lwsw (0 iters)	92.73±0.89	92.86±0.95	43.56±1.20			83.01±5.47		86.12±2.96
Lwsw (50 iters)	92.98±0.85	93.01±1.02	45.09±1.15			82.70±5.76		86.07±2.68
Lwsw (250 iters)	92.99±0.84	93.06±1.02	45.13±1.17			82.75±5.79		86.08±2.67
CG→1st	88.05	91.10	64.01±1.04	39.81	81.56	87.99	42.90	83.29±3.07
CG→Bigram (unsup, 0 iters)	91.83±1.03	91.39±1.42	60.37±1.45			86.77±6.33		81.31±3.10
CG→Bigram (unsup, 50 iters)	93.16±1.39	92.53±1.29	60.91±1.65			87.48±6.16		86.11±2.46
CG→Bigram (unsup, 250 iters)	92.99±1.38	92.50±1.23	60.88±1.66			87.20±6.72		86.01±2.59
CG→Lwsw (0 iters)	93.17±1.08	92.72±1.09	59.93±1.46			86.60±6.20		85.64±2.83
CG→Lwsw (50 iters)	93.37±1.02	92.74±1.16	60.38±1.57			86.54±6.21		85.55±2.72
CG→Lwsw (250 iters)	93.38±1.05	92.77±1.18	60.42±1.53			86.54±6.20		85.54±2.72
Unigram model 1	93.86±1.13	93.96±0.98	63.96±0.92	39.11±8.91	80.63±3.87	86.00±6.63	46.48±5.78	89.37±1.63
Unigram model 2	93.90±1.09	93.69±0.94	67.51±0.67	40.36±8.59	82.19±3.70	87.13±6.23	47.12±8.29	89.23±0.97
Unigram model 3	93.88±1.08	93.67±0.94	67.47±0.64	40.36±8.59	82.45±3.80	87.11±6.13	47.12±8.29	89.00±0.95
Bigram (sup)	96.00±0.87	95.47±1.07	55.26±0.87			88.07±6.50
CG→Unigram model 1	94.34±1.11	94.73±0.88	68.42±0.69	40.71±9.39	84.54±3.29	88.42±6.55	46.84±5.48	89.04±1.45
CG→Unigram model 2	94.11±1.09	94.33±0.82	68.93±0.72	41.43±9.21	84.62±3.47	88.64±6.13	47.07±7.39	88.67±0.93
CG→Unigram model 3	94.09±1.08	94.31±0.81	68.88±0.72	41.43±9.21	84.71±3.54	88.63±6.07	47.07±7.39	88.45±0.94
CG→Bigram (sup)	96.00±1.13	94.88±1.18	65.66±1.16			88.73±6.36
Percep (coarsebigram)	94.02±1.26	94.79±0.86	55.64±1.17			87.04±6.23		90.87±0.87
Percep (kaztags)	93.66±0.76	94.28±0.93	70.44±0.92		91.41±2.09	87.07±6.16	99.70±0.96	90.64±1.13
Percep (spacycoarsetags)	95.06±1.01	95.23±0.66	56.34±1.21			87.32±6.22		90.96±0.76
Percep (spacyflattags)	95.25±0.85	95.46±0.64	73.02±1.12		91.91±2.13	87.45±6.24	99.70±0.96	90.13±1.37
Percep (unigram)	93.59±0.77	94.09±0.96	70.11±0.97		91.08±2.13	87.16±6.22	99.70±0.96	90.23±0.95
CG→Percep (coarsebigram)	94.01±1.28	94.75±0.69	67.32±0.96			88.70±6.29		89.25±1.17
CG→Percep (kaztags)	93.91±0.90	94.72±0.88	72.79±1.11		87.73±3.12	88.72±6.23	94.34±3.16	89.82±1.29
CG→Percep (spacycoarsetags)	94.93±1.12	95.16±0.78	67.81±1.11			88.83±6.13		89.88±1.03
CG→Percep (spacyflattags)	95.19±0.98	95.40±0.66	72.80±0.76		87.62±2.83	88.85±6.21	94.34±3.16	89.34±1.24
CG→Percep (unigram)	93.87±0.92	94.73±0.77	72.42±0.86		87.52±3.09	88.81±6.28	94.34±3.16	89.39±1.24

In the following table, the values represent availability-adjusted tagger recall, that is, [true positives] / [words with a correct analysis from the morphological parser]. This data is also available in box plot form.

System	Language
	Catalan	Spanish	Serbo-Croatian	Russian	Kazakh	Portuguese	Swedish	Italian
	23,673	20,487	20,071	1,052	13,714	6,725	369	5,201
1st	87.86	91.82	52.56±1.53	75.93	77.72	83.00	64.47	82.77±3.09
Bigram (unsup, 0 iters)	90.35±1.17	89.95±1.45	55.27±1.63			89.72±2.06		79.64±3.11
Bigram (unsup, 50 iters)	93.17±1.21	92.63±1.40	56.40±1.70			89.35±1.99		85.45±2.78
Bigram (unsup, 250 iters)	92.94±1.22	92.35±1.33	56.13±1.87			88.45±2.51		85.03±2.87
Lwsw (0 iters)	94.18±0.91	94.40±0.77	50.88±1.54			91.51±1.22		86.64±3.15
Lwsw (50 iters)	94.44±0.81	94.54±0.83	52.67±1.46			91.14±1.62		86.59±2.82
Lwsw (250 iters)	94.44±0.79	94.60±0.84	52.72±1.50			91.20±1.64		86.60±2.81
CG→1st	89.44	92.60	74.77±1.32	79.10	87.95	95.22	79.70	83.79±3.08
CG→Bigram (unsup, 0 iters)	93.27±1.10	92.90±1.30	70.52±1.71			95.61±1.77		81.80±3.08
CG→Bigram (unsup, 50 iters)	94.62±1.49	94.05±1.13	71.15±1.94			96.41±1.38		86.63±2.51
CG→Bigram (unsup, 250 iters)	94.45±1.48	94.03±1.09	71.11±1.95			96.06±2.05		86.53±2.62
CG→Lwsw (0 iters)	94.63±1.08	94.25±0.91	70.00±1.74			95.43±1.52		86.16±2.97
CG→Lwsw (50 iters)	94.83±1.01	94.27±0.97	70.53±1.86			95.36±1.54		86.07±2.79
CG→Lwsw (250 iters)	94.84±1.03	94.30±0.99	70.58±1.81			95.36±1.53		86.06±2.79
Unigram model 1	95.33±1.05	95.51±0.84	74.72±1.43	77.54±6.51	87.03±3.03	94.74±2.44	89.26±7.32	89.91±1.93
Unigram model 2	95.37±1.04	95.23±0.77	78.87±1.05	80.06±6.11	88.72±2.76	96.01±1.70	89.82±7.70	89.77±1.23
Unigram model 3	95.35±1.03	95.22±0.79	78.82±1.06	80.06±6.11	88.99±2.83	95.99±1.52	89.82±7.70	89.54±1.25
Bigram (sup)	97.50±0.93	97.04±0.86	64.55±1.33			97.03±1.75
CG→Unigram model 1	95.82±1.06	96.30±0.68	79.92±0.95	80.56±6.70	91.25±2.01	97.42±1.76	90.00±6.99	89.58±1.75
CG→Unigram model 2	95.58±1.07	95.89±0.59	80.51±0.95	82.06±6.50	91.33±2.15	97.70±1.32	89.97±7.50	89.21±1.13
CG→Unigram model 3	95.56±1.05	95.86±0.60	80.46±0.99	82.06±6.50	91.43±2.26	97.69±1.28	89.97±7.50	88.98±1.18
CG→Bigram (sup)	97.51±1.21	96.45±0.93	76.70±1.46			97.78±1.52
Percep (coarsebigram)	95.71±1.36	96.60±0.75	61.99±1.24			95.92±1.60		92.89±1.10
Percep (kaztags)	95.34±0.77	96.08±0.69	78.47±0.99		91.41±2.08	95.95±1.69	99.70±0.96	92.67±1.31
Percep (spacycoarsetags)	96.76±1.06	97.05±0.56	62.77±1.29			96.22±1.52		92.99±0.93
Percep (spacyflattags)	96.96±0.87	97.28±0.58	81.35±1.19		91.92±2.12	96.37±1.53	99.70±0.96	92.14±1.44
Percep (unigram)	95.27±0.76	95.89±0.74	78.11±1.03		91.08±2.12	96.05±1.64	99.70±0.96	92.24±1.11
CG→Percep (coarsebigram)	95.70±1.37	96.55±0.55	75.00±1.04			97.75±1.47		91.25±1.50
CG→Percep (kaztags)	95.59±0.92	96.53±0.66	81.10±1.20		87.74±3.11	97.78±1.41	94.34±3.16	91.83±1.50
CG→Percep (spacycoarsetags)	96.64±1.17	96.98±0.64	75.54±1.31			97.90±1.30		91.89±1.20
CG→Percep (spacyflattags)	96.90±1.02	97.22±0.51	81.10±0.86		87.62±2.82	97.92±1.38	94.34±3.16	91.34±1.42
CG→Percep (unigram)	95.55±0.92	96.54±0.52	80.68±0.93		87.52±3.08	97.87±1.47	94.34±3.16	91.38±1.40
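
The availability-adjusted recall defined above the table can be sketched as follows. This is a minimal illustration, not the evaluation script used for the experiments: the function name, the three parallel input sequences, and the tag strings are assumptions made for the example. Tokens whose correct analysis was never offered by the morphological parser are left out of the denominator.

```python
def availability_adjusted_recall(gold, predicted, candidates):
    """Percentage of correctly tagged tokens among those tokens for which
    the morphological parser offered the correct analysis.

    gold       -- correct analysis per token
    predicted  -- analysis chosen by the tagger per token
    candidates -- set of analyses offered by the parser per token
    (All three names are hypothetical; they are not from the wiki page.)
    """
    available = 0
    true_positives = 0
    for g, p, cands in zip(gold, predicted, candidates):
        if g not in cands:
            # The parser never offered the correct analysis, so the tagger
            # could not have chosen it; the token is excluded.
            continue
        available += 1
        if p == g:
            true_positives += 1
    if available == 0:
        raise ValueError("no token has a correct analysis available")
    return 100.0 * true_positives / available


# Toy data: the last token's gold analysis is missing from its candidates.
gold = ["n", "vblex", "adj", "det", "pr"]
predicted = ["n", "n", "adj", "det", "n"]
candidates = [{"n", "vblex"}, {"n", "vblex"}, {"adj"}, {"det", "prn"}, {"n"}]
score = availability_adjusted_recall(gold, predicted, candidates)
print(f"{score:.2f}")
```

In the toy data, four of the five tokens have the correct analysis available and three of those are tagged correctly.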

In the following table, the intervals represent the [low, high] values from 10-fold cross-validation.

Language	Corpus			System
Language	Sent	Tok	Amb	1st	CG+1st	Unigram	CG+Unigram	apertium-tagger	CG+apertium-tagger
Catalan	1,413	24,144	?	81.85	83.96	[75.65, 78.46]	[87.76, 90.48]	[94.16, 96.28]	[93.92, 96.16]
Spanish	1,271	21,247	?	86.18	86.71	[78.20, 80.06]	[87.72, 90.27]	[90.15, 94.86]	[91.84, 93.70]
Serbo-Croatian	1,190	20,128	?	75.22	79.67	[75.36, 78.79]	[75.36, 77.28]
Russian	451	10,171	?	75.63	79.52	[70.49, 72.94]	[74.68, 78.65]	n/a	n/a
Kazakh	403	4,348	?	80.79	86.19	[84.36, 87.79]	[85.56, 88.72]	n/a	n/a
Portuguese	119	3,823	?	72.54	87.34	[77.10, 87.72]	[84.05, 91.96]
Swedish	11	239	?	72.90	73.86	[56.00, 82.97]

Sent = sentences, Tok = tokens, Amb = average ambiguity from the morphological analyser
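
A [low, high] interval of the kind reported in the table can be obtained by scoring each of the ten folds separately and taking the minimum and maximum. The sketch below assumes the folds are drawn by striding over the sentence list; the page does not say how the folds were actually built, and the `evaluate` callback is a hypothetical stand-in for training and scoring a tagger.

```python
def k_fold_splits(items, k=10):
    """Yield (train, test) pairs. Fold i tests on every k-th item starting
    at index i and trains on the rest (an assumed splitting scheme)."""
    for i in range(k):
        test = items[i::k]
        train = [x for j, x in enumerate(items) if j % k != i]
        yield train, test


def cross_validation_interval(items, evaluate, k=10):
    """Return (low, high): the worst and best per-fold scores.

    evaluate(train, test) must return a single number for one fold.
    """
    scores = [evaluate(train, test) for train, test in k_fold_splits(items, k)]
    return min(scores), max(scores)


# Toy usage: the "score" is just the mean of the held-out items.
sentences = list(range(100))
low, high = cross_validation_interval(
    sentences, lambda train, test: sum(test) / len(test)
)
print(f"[{low:.2f}, {high:.2f}]")
```

With very small corpora, such as the Swedish one, each fold holds only a handful of sentences, which is one plausible reason for a wide interval.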

Systems

1st: Selects the first analysis from the morphological analyser
CG: Uses the Constraint Grammar (CG) rules from the monolingual language package in languages to preprocess the input.
Unigram: Lexicalised unigram tagger
apertium-tagger: Uses the bigram HMM tagger included with Apertium.
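
To make the difference between the 1st baseline and a lexicalised unigram tagger concrete, here is a minimal sketch. It is not the implementation evaluated above and does not correspond to any specific one of the unigram models in the tables; the class name, the lowercasing of surface forms, and the fallback to the first analysis for unseen words are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def first_analysis(candidates):
    """'1st' baseline: take the analyser's first analysis as-is."""
    return candidates[0]


class UnigramTagger:
    """Lexicalised unigram sketch: for each surface form, prefer the
    analysis seen most often with that form in the tagged corpus."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, tagged_tokens):
        for surface, analysis in tagged_tokens:
            self.counts[surface.lower()][analysis] += 1

    def tag(self, surface, candidates):
        seen = self.counts.get(surface.lower())
        if seen:
            # Counter returns 0 for analyses never seen with this form.
            best = max(candidates, key=lambda a: seen[a])
            if seen[best] > 0:
                return best
        # Assumed fallback for unseen forms: behave like the 1st baseline.
        return first_analysis(candidates)


tagger = UnigramTagger()
tagger.train([("saw", "vblex"), ("saw", "vblex"), ("saw", "n")])
print(first_analysis(["n", "vblex"]))      # baseline ignores the corpus
print(tagger.tag("saw", ["n", "vblex"]))   # unigram uses corpus counts
```

In a CG→ pipeline, the candidate list passed to either function would already have been pruned by the Constraint Grammar rules.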

Corpora

The tagged corpora used in the experiments are found in the monolingual packages in languages, under the texts/ subdirectory.
