Comparison of part-of-speech tagging systems
Latest revision as of 17:50, 22 August 2017
Apertium aims for high-quality part-of-speech tagging, but in many cases its taggers fall below the state of the art (around 97% tagging accuracy). This page collects a comparison of the tagging systems available in Apertium and gives some ideas of what could be done to improve them.
The scripts used to generate these results are written in Python and are available from SVN under /branches/apertium-tagger/experiments/ : https://svn.code.sf.net/p/apertium/svn/branches/apertium-tagger/experiments/
In the following two tables, values of the form x±y are the sample mean and standard deviation of the results of 10-fold cross validation.
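As a rough illustration of how a cell of the form x±y could be produced, the sketch below splits a corpus into 10 folds and summarises per-fold scores as a sample mean and sample standard deviation. The helper names, the round-robin fold assignment and the example scores are assumptions made for illustration; the experiment scripts in SVN may split and aggregate the data differently.

```python
import statistics


def k_fold_splits(items, k=10):
    """Yield (train, test) pairs; each item lands in exactly one test fold.

    Round-robin assignment is an assumption, not necessarily what the
    experiment scripts do.
    """
    for i in range(k):
        test = items[i::k]
        train = [x for j, x in enumerate(items) if j % k != i]
        yield train, test


def format_cell(scores):
    """Format per-fold scores as 'mean±sd' using the sample standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # divides by n - 1, i.e. the sample estimate
    return f"{mean:.2f}±{sd:.2f}"


# Hypothetical per-fold recall percentages, one value per fold.
fold_scores = [95.1, 96.3, 94.8, 97.0, 95.9, 96.4, 95.2, 96.8, 94.9, 96.6]
print(format_cell(fold_scores))
```

Entries in the tables without a ± part are single figures rather than cross-validation summaries.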
In the following table the values represent tagger recall (= [true positives]/[total tokens]):
| System | Catalan | Spanish | Serbo-Croatian | Russian | Kazakh | Portuguese | Swedish | Italian |
|---|---|---|---|---|---|---|---|---|
| | 23,673 | 20,487 | 20,071 | 1,052 | 13,714 | 6,725 | 369 | 5,201 |
| 1st | 86.50 | 90.34 | 44.99±1.20 | 38.19 | 72.08 | 76.70 | 34.70 | 82.28±3.05 |
| Bigram (unsup, 0 iters) | 88.96±1.12 | 88.49±1.54 | 47.31±1.24 | | | 81.41±5.78 | | 79.16±3.12 |
| Bigram (unsup, 50 iters) | 91.74±1.15 | 91.13±1.52 | 48.28±1.33 | | | 81.09±5.99 | | 84.93±2.71 |
| Bigram (unsup, 250 iters) | 91.51±1.16 | 90.85±1.48 | 48.05±1.47 | | | 80.31±6.60 | | 84.52±2.78 |
| Lwsw (0 iters) | 92.73±0.89 | 92.86±0.95 | 43.56±1.20 | | | 83.01±5.47 | | 86.12±2.96 |
| Lwsw (50 iters) | 92.98±0.85 | 93.01±1.02 | 45.09±1.15 | | | 82.70±5.76 | | 86.07±2.68 |
| Lwsw (250 iters) | 92.99±0.84 | 93.06±1.02 | 45.13±1.17 | | | 82.75±5.79 | | 86.08±2.67 |
| CG→1st | 88.05 | 91.10 | 64.01±1.04 | 39.81 | 81.56 | 87.99 | 42.90 | 83.29±3.07 |
| CG→Bigram (unsup, 0 iters) | 91.83±1.03 | 91.39±1.42 | 60.37±1.45 | | | 86.77±6.33 | | 81.31±3.10 |
| CG→Bigram (unsup, 50 iters) | 93.16±1.39 | 92.53±1.29 | 60.91±1.65 | | | 87.48±6.16 | | 86.11±2.46 |
| CG→Bigram (unsup, 250 iters) | 92.99±1.38 | 92.50±1.23 | 60.88±1.66 | | | 87.20±6.72 | | 86.01±2.59 |
| CG→Lwsw (0 iters) | 93.17±1.08 | 92.72±1.09 | 59.93±1.46 | | | 86.60±6.20 | | 85.64±2.83 |
| CG→Lwsw (50 iters) | 93.37±1.02 | 92.74±1.16 | 60.38±1.57 | | | 86.54±6.21 | | 85.55±2.72 |
| CG→Lwsw (250 iters) | 93.38±1.05 | 92.77±1.18 | 60.42±1.53 | | | 86.54±6.20 | | 85.54±2.72 |
| Unigram model 1 | 93.86±1.13 | 93.96±0.98 | 63.96±0.92 | 39.11±8.91 | 80.63±3.87 | 86.00±6.63 | 46.48±5.78 | 89.37±1.63 |
| Unigram model 2 | 93.90±1.09 | 93.69±0.94 | 67.51±0.67 | 40.36±8.59 | 82.19±3.70 | 87.13±6.23 | 47.12±8.29 | 89.23±0.97 |
| Unigram model 3 | 93.88±1.08 | 93.67±0.94 | 67.47±0.64 | 40.36±8.59 | 82.45±3.80 | 87.11±6.13 | 47.12±8.29 | 89.00±0.95 |
| Bigram (sup) | 96.00±0.87 | 95.47±1.07 | 55.26±0.87 | | | 88.07±6.50 | | |
| CG→Unigram model 1 | 94.34±1.11 | 94.73±0.88 | 68.42±0.69 | 40.71±9.39 | 84.54±3.29 | 88.42±6.55 | 46.84±5.48 | 89.04±1.45 |
| CG→Unigram model 2 | 94.11±1.09 | 94.33±0.82 | 68.93±0.72 | 41.43±9.21 | 84.62±3.47 | 88.64±6.13 | 47.07±7.39 | 88.67±0.93 |
| CG→Unigram model 3 | 94.09±1.08 | 94.31±0.81 | 68.88±0.72 | 41.43±9.21 | 84.71±3.54 | 88.63±6.07 | 47.07±7.39 | 88.45±0.94 |
| CG→Bigram (sup) | 96.00±1.13 | 94.88±1.18 | 65.66±1.16 | | | 88.73±6.36 | | |
| Percep (coarsebigram) | 94.02±1.26 | 94.79±0.86 | 55.64±1.17 | | | 87.04±6.23 | | 90.87±0.87 |
| Percep (kaztags) | 93.66±0.76 | 94.28±0.93 | 70.44±0.92 | | 91.41±2.09 | 87.07±6.16 | 99.70±0.96 | 90.64±1.13 |
| Percep (spacycoarsetags) | 95.06±1.01 | 95.23±0.66 | 56.34±1.21 | | | 87.32±6.22 | | 90.96±0.76 |
| Percep (spacyflattags) | 95.25±0.85 | 95.46±0.64 | 73.02±1.12 | | 91.91±2.13 | 87.45±6.24 | 99.70±0.96 | 90.13±1.37 |
| Percep (unigram) | 93.59±0.77 | 94.09±0.96 | 70.11±0.97 | | 91.08±2.13 | 87.16±6.22 | 99.70±0.96 | 90.23±0.95 |
| CG→Percep (coarsebigram) | 94.01±1.28 | 94.75±0.69 | 67.32±0.96 | | | 88.70±6.29 | | 89.25±1.17 |
| CG→Percep (kaztags) | 93.91±0.90 | 94.72±0.88 | 72.79±1.11 | | 87.73±3.12 | 88.72±6.23 | 94.34±3.16 | 89.82±1.29 |
| CG→Percep (spacycoarsetags) | 94.93±1.12 | 95.16±0.78 | 67.81±1.11 | | | 88.83±6.13 | | 89.88±1.03 |
| CG→Percep (spacyflattags) | 95.19±0.98 | 95.40±0.66 | 72.80±0.76 | | 87.62±2.83 | 88.85±6.21 | 94.34±3.16 | 89.34±1.24 |
| CG→Percep (unigram) | 93.87±0.92 | 94.73±0.77 | 72.42±0.86 | | 87.52±3.09 | 88.81±6.28 | 94.34±3.16 | 89.39±1.24 |
In the following table the values represent availability-adjusted tagger recall (= [true positives]/[words with a correct analysis from the morphological parser]). This data is also available in box plot form.
| System | Catalan | Spanish | Serbo-Croatian | Russian | Kazakh | Portuguese | Swedish | Italian |
|---|---|---|---|---|---|---|---|---|
| | 23,673 | 20,487 | 20,071 | 1,052 | 13,714 | 6,725 | 369 | 5,201 |
| 1st | 87.86 | 91.82 | 52.56±1.53 | 75.93 | 77.72 | 83.00 | 64.47 | 82.77±3.09 |
| Bigram (unsup, 0 iters) | 90.35±1.17 | 89.95±1.45 | 55.27±1.63 | | | 89.72±2.06 | | 79.64±3.11 |
| Bigram (unsup, 50 iters) | 93.17±1.21 | 92.63±1.40 | 56.40±1.70 | | | 89.35±1.99 | | 85.45±2.78 |
| Bigram (unsup, 250 iters) | 92.94±1.22 | 92.35±1.33 | 56.13±1.87 | | | 88.45±2.51 | | 85.03±2.87 |
| Lwsw (0 iters) | 94.18±0.91 | 94.40±0.77 | 50.88±1.54 | | | 91.51±1.22 | | 86.64±3.15 |
| Lwsw (50 iters) | 94.44±0.81 | 94.54±0.83 | 52.67±1.46 | | | 91.14±1.62 | | 86.59±2.82 |
| Lwsw (250 iters) | 94.44±0.79 | 94.60±0.84 | 52.72±1.50 | | | 91.20±1.64 | | 86.60±2.81 |
| CG→1st | 89.44 | 92.60 | 74.77±1.32 | 79.10 | 87.95 | 95.22 | 79.70 | 83.79±3.08 |
| CG→Bigram (unsup, 0 iters) | 93.27±1.10 | 92.90±1.30 | 70.52±1.71 | | | 95.61±1.77 | | 81.80±3.08 |
| CG→Bigram (unsup, 50 iters) | 94.62±1.49 | 94.05±1.13 | 71.15±1.94 | | | 96.41±1.38 | | 86.63±2.51 |
| CG→Bigram (unsup, 250 iters) | 94.45±1.48 | 94.03±1.09 | 71.11±1.95 | | | 96.06±2.05 | | 86.53±2.62 |
| CG→Lwsw (0 iters) | 94.63±1.08 | 94.25±0.91 | 70.00±1.74 | | | 95.43±1.52 | | 86.16±2.97 |
| CG→Lwsw (50 iters) | 94.83±1.01 | 94.27±0.97 | 70.53±1.86 | | | 95.36±1.54 | | 86.07±2.79 |
| CG→Lwsw (250 iters) | 94.84±1.03 | 94.30±0.99 | 70.58±1.81 | | | 95.36±1.53 | | 86.06±2.79 |
| Unigram model 1 | 95.33±1.05 | 95.51±0.84 | 74.72±1.43 | 77.54±6.51 | 87.03±3.03 | 94.74±2.44 | 89.26±7.32 | 89.91±1.93 |
| Unigram model 2 | 95.37±1.04 | 95.23±0.77 | 78.87±1.05 | 80.06±6.11 | 88.72±2.76 | 96.01±1.70 | 89.82±7.70 | 89.77±1.23 |
| Unigram model 3 | 95.35±1.03 | 95.22±0.79 | 78.82±1.06 | 80.06±6.11 | 88.99±2.83 | 95.99±1.52 | 89.82±7.70 | 89.54±1.25 |
| Bigram (sup) | 97.50±0.93 | 97.04±0.86 | 64.55±1.33 | | | 97.03±1.75 | | |
| CG→Unigram model 1 | 95.82±1.06 | 96.30±0.68 | 79.92±0.95 | 80.56±6.70 | 91.25±2.01 | 97.42±1.76 | 90.00±6.99 | 89.58±1.75 |
| CG→Unigram model 2 | 95.58±1.07 | 95.89±0.59 | 80.51±0.95 | 82.06±6.50 | 91.33±2.15 | 97.70±1.32 | 89.97±7.50 | 89.21±1.13 |
| CG→Unigram model 3 | 95.56±1.05 | 95.86±0.60 | 80.46±0.99 | 82.06±6.50 | 91.43±2.26 | 97.69±1.28 | 89.97±7.50 | 88.98±1.18 |
| CG→Bigram (sup) | 97.51±1.21 | 96.45±0.93 | 76.70±1.46 | | | 97.78±1.52 | | |
| Percep (coarsebigram) | 95.71±1.36 | 96.60±0.75 | 61.99±1.24 | | | 95.92±1.60 | | 92.89±1.10 |
| Percep (kaztags) | 95.34±0.77 | 96.08±0.69 | 78.47±0.99 | | 91.41±2.08 | 95.95±1.69 | 99.70±0.96 | 92.67±1.31 |
| Percep (spacycoarsetags) | 96.76±1.06 | 97.05±0.56 | 62.77±1.29 | | | 96.22±1.52 | | 92.99±0.93 |
| Percep (spacyflattags) | 96.96±0.87 | 97.28±0.58 | 81.35±1.19 | | 91.92±2.12 | 96.37±1.53 | 99.70±0.96 | 92.14±1.44 |
| Percep (unigram) | 95.27±0.76 | 95.89±0.74 | 78.11±1.03 | | 91.08±2.12 | 96.05±1.64 | 99.70±0.96 | 92.24±1.11 |
| CG→Percep (coarsebigram) | 95.70±1.37 | 96.55±0.55 | 75.00±1.04 | | | 97.75±1.47 | | 91.25±1.50 |
| CG→Percep (kaztags) | 95.59±0.92 | 96.53±0.66 | 81.10±1.20 | | 87.74±3.11 | 97.78±1.41 | 94.34±3.16 | 91.83±1.50 |
| CG→Percep (spacycoarsetags) | 96.64±1.17 | 96.98±0.64 | 75.54±1.31 | | | 97.90±1.30 | | 91.89±1.20 |
| CG→Percep (spacyflattags) | 96.90±1.02 | 97.22±0.51 | 81.10±0.86 | | 87.62±2.82 | 97.92±1.38 | 94.34±3.16 | 91.34±1.42 |
| CG→Percep (unigram) | 95.55±0.92 | 96.54±0.52 | 80.68±0.93 | | 87.52±3.08 | 97.87±1.47 | 94.34±3.16 | 91.38±1.40 |
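The two measures used in the tables above differ only in their denominator. The following is a minimal sketch of both, assuming each token carries a gold analysis, the analyser's candidate analyses and the tagger's choice. The `Token` class, its field names and the tag strings are hypothetical and are not taken from the experiment scripts.

```python
from dataclasses import dataclass


@dataclass
class Token:
    gold: str         # reference analysis from the hand-tagged corpus
    candidates: list  # analyses offered by the morphological analyser
    chosen: str       # analysis selected by the tagger


def recall(tokens):
    """[true positives] / [total tokens], as a percentage."""
    correct = sum(t.chosen == t.gold for t in tokens)
    return 100.0 * correct / len(tokens)


def availability_adjusted_recall(tokens):
    """[true positives] / [tokens whose gold analysis the analyser offered]."""
    available = [t for t in tokens if t.gold in t.candidates]
    correct = sum(t.chosen == t.gold for t in available)
    return 100.0 * correct / len(available)


# Made-up example: the third token's gold analysis is missing from the
# analyser output, so no tagger could get it right.
tokens = [
    Token("n.sg", ["n.sg", "vblex.pres"], "n.sg"),
    Token("det", ["det", "prn"], "det"),
    Token("adj", ["n.sg"], "n.sg"),
    Token("vblex.pres", ["n.pl", "vblex.pres"], "n.pl"),
]
print(recall(tokens), availability_adjusted_recall(tokens))
```

Because tokens the analyser cannot analyse correctly are excluded from the denominator, the adjusted figure is never lower than plain recall, which matches the pattern in the two tables (compare the Russian and Swedish columns).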
In the following table, the intervals represent the [low, high] values from 10-fold cross validation.
| Language | Sent | Tok | Amb | 1st | CG+1st | Unigram | CG+Unigram | apertium-tagger | CG+apertium-tagger |
|---|---|---|---|---|---|---|---|---|---|
| Catalan | 1,413 | 24,144 | ? | 81.85 | 83.96 | [75.65, 78.46] | [87.76, 90.48] | [94.16, 96.28] | [93.92, 96.16] |
| Spanish | 1,271 | 21,247 | ? | 86.18 | 86.71 | [78.20, 80.06] | [87.72, 90.27] | [90.15, 94.86] | [91.84, 93.70] |
| Serbo-Croatian | 1,190 | 20,128 | ? | 75.22 | 79.67 | [75.36, 78.79] | [75.36, 77.28] | | |
| Russian | 451 | 10,171 | ? | 75.63 | 79.52 | [70.49, 72.94] | [74.68, 78.65] | n/a | n/a |
| Kazakh | 403 | 4,348 | ? | 80.79 | 86.19 | [84.36, 87.79] | [85.56, 88.72] | n/a | n/a |
| Portuguese | 119 | 3,823 | ? | 72.54 | 87.34 | [77.10, 87.72] | [84.05, 91.96] | | |
| Swedish | 11 | 239 | ? | 72.90 | 73.86 | [56.00, 82.97] | | | |
Sent = sentences, Tok = tokens, Amb = average ambiguity from the morphological analyser
Systems
- 1st: Selects the first analysis from the morphological analyser.
- CG: Uses the CG (from the monolingual language package in languages) to preprocess the input.
- Unigram: Lexicalised unigram tagger.
- apertium-tagger: Uses the bigram HMM tagger included with Apertium.
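To make the two simplest systems concrete, here is a small sketch of a "1st" baseline and of one possible reading of a lexicalised unigram tagger, operating on lexical units in the Apertium stream format (`^surface/analysis1/analysis2$`). The function names and example analyses are made up for illustration, escaped characters in the stream are ignored, and the unigram logic shown is only the simplest frequency-based variant, not the actual unigram models 1 to 3 evaluated on this page.

```python
import re
from collections import Counter, defaultdict

# One lexical unit: ^surface/analysis1/analysis2$ (escapes are not handled).
LU = re.compile(r"\^([^/$]*)/([^$]*)\$")


def parse_units(line):
    """Return [(surface, [analysis, ...]), ...] for each lexical unit."""
    return [(m.group(1), m.group(2).split("/")) for m in LU.finditer(line)]


def tag_first(units):
    """'1st' baseline: always keep the first analysis."""
    return [(surface, analyses[0]) for surface, analyses in units]


def train_unigram(tagged):
    """Count how often each surface form received each analysis."""
    counts = defaultdict(Counter)
    for surface, analysis in tagged:
        counts[surface.lower()][analysis] += 1
    return counts


def tag_unigram(units, counts):
    """Pick the analysis seen most often with this surface form in training.

    Unseen forms and ties fall back to the first analysis, because max()
    returns the first maximal element.
    """
    result = []
    for surface, analyses in units:
        seen = counts.get(surface.lower(), Counter())
        result.append((surface, max(analyses, key=lambda a: seen[a])))
    return result


line = "^casa/casa<n><f><sg>/casar<vblex><pri><p3><sg>$ ^blanca/blanco<adj><f><sg>$"
units = parse_units(line)
model = train_unigram([("casa", "casar<vblex><pri><p3><sg>")] * 2)
print(tag_first(units))
print(tag_unigram(units, model))
```

In the CG→ variants, a constraint grammar would first remove some of the candidate analyses, and the tagger would then choose among those that remain.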
Corpora
The tagged corpora used in the experiments are found in the monolingual packages in languages, under the texts/ subdirectory.