Difference between revisions of "Talk:Afrikaans and English"

{{TOCD}}

==Performance==

Some basic statistics on the quality (or rather coverage) of the Afrikaans->English system.

* Unknown, <code>*</code>: the word could not be analysed (it is not in the monolingual dictionary).
* Transfer error, <code>#</code>: an error occurred in transfer (some symbols could not be resolved).
* Untranslated, <code>@</code>: the word was analysed, but no translation for it appears in the bilingual dictionary.
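
The counts in the tables on this page could be reproduced along the following lines. This is only a rough sketch, not the script that was actually used: it assumes the translated output is plain text, that words are separated by whitespace, and that each mark (<code>*</code>, <code>#</code>, <code>@</code>) is the first character of the word it flags. The function name <code>count_marks</code> and the sample sentence are made up for illustration.

```python
from collections import Counter

# Marks as described in the list above.
MARKS = {"*": "unknown", "#": "transfer_error", "@": "untranslated"}


def count_marks(text):
    """Count marked words in translated output and give each as a percentage.

    Assumption: tokens are whitespace-separated and a mark, if present,
    is the first character of the token.
    """
    tokens = text.split()
    counts = Counter()
    for token in tokens:
        kind = MARKS.get(token[0])
        if kind is not None:
            counts[kind] += 1

    total = len(tokens)
    result = {"words": total}
    for kind in MARKS.values():
        percent = 100.0 * counts[kind] / total if total else 0.0
        result[kind] = (counts[kind], percent)
    return result


if __name__ == "__main__":
    # Hypothetical output line containing one word of each marked kind.
    sample = "The *hond has #see the @kat"
    for key, value in count_marks(sample).items():
        print(key, value)
```

Note that the word counts in the tables may have been taken from the source text rather than from the translated output, so totals computed this way need not match exactly.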

===29 October 2007===

All texts from Die Volksblad.

{|class=sortable
! Filename !! Word count !! Unknown !! (%) !! Transfer error !! (%) !! Untranslated !! (%)
|-
| 001.txt || 383 || 46 || ~12.01 || 49 || ~12.79 || 3 || ~0.783
|-
| 002.txt || 582 || 151 || ~25.94 || 60 || ~10.30 || 11 || ~1.890
|-
| 003.txt || 943 || 233 || ~24.70 || 112 || ~11.87 || 20 || ~2.120
|-
| 004.txt || 747 || 176 || ~23.56 || 59 || ~7.898 || 31 || ~4.149
|-
| 005.txt || 201 || 16 || ~7.960 || 18 || ~8.955 || 3 || ~1.492
|-
| 006.txt || 307 || 56 || ~18.24 || 27 || ~8.794 || 3 || ~0.977
|-
| 007.txt || 417 || 101 || ~24.22 || 36 || ~8.633 || 7 || ~1.678
|-
| 008.txt || 405 || 89 || ~21.97 || 27 || ~6.666 || 9 || ~2.222
|-
| 009.txt || 304 || 61 || ~20.06 || 43 || ~14.14 || 6 || ~1.973
|-
| 010.txt || 157 || 16 || ~10.19 || 16 || ~10.19 || 1 || ~0.636
|-
| 011.txt || 335 || 55 || ~16.41 || 35 || ~10.44 || 4 || ~1.194
|-
| 012.txt || 464 || 62 || ~13.36 || 62 || ~13.36 || 6 || ~1.293
|-
| 013.txt || 155 || 20 || ~12.90 || 18 || ~11.61 || 0 || 0
|-
| 014.txt || 423 || 55 || ~13.00 || 64 || ~15.13 || 6 || ~1.418
|-
| 015.txt || 410 || 83 || ~20.24 || 35 || ~8.536 || 10 || ~2.439
|-
| 016.txt || 347 || 47 || ~13.54 || 52 || ~14.98 || 8 || ~2.305
|-
| 017.txt || 357 || 78 || ~21.84 || 41 || ~11.48 || 8 || ~2.240
|-
| 018.txt || 491 || 45 || ~9.164 || 78 || ~15.88 || 4 || ~0.814
|-
| 019.txt || 378 || 72 || ~19.04 || 38 || ~10.05 || 6 || ~1.587
|-
| 020.txt || 592 || 116 || ~19.59 || 58 || ~9.797 || 3 || ~0.506
|-
| 021.txt || 335 || 57 || ~17.01 || 29 || ~8.656 || 3 || ~0.895
|-
| 022.txt || 327 || 69 || ~21.10 || 22 || ~6.727 || 11 || ~3.363
|-
| 023.txt || 311 || 63 || ~20.25 || 29 || ~9.324 || 8 || ~2.572
|-
| 024.txt || 183 || 26 || ~14.20 || 16 || ~8.743 || 2 || ~1.092
|-
| 025.txt || 226 || 43 || ~19.02 || 19 || ~8.407 || 0 || 0
|-
| 026.txt || 283 || 47 || ~16.60 || 27 || ~9.540 || 2 || ~0.706
|-
|}



===12 August 2007===

All texts from Die Volksblad.

{|class=sortable
! Filename !! Word count !! Unknown !! (%) !! Transfer error !! (%) !! Untranslated !! (%)
|-
| 001.txt || 383 || 56 || ~14.62 || 48 || ~12.53 || 3 || ~0.783
|-
| 002.txt || 582 || 181 || ~31.09 || 42 || ~7.216 || 12 || ~2.061
|-
| 003.txt || 943 || 300 || ~31.81 || 97 || ~10.28 || 16 || ~1.696
|-
| 004.txt || 747 || 235 || ~31.45 || 44 || ~5.890 || 22 || ~2.945
|-
| 005.txt || 201 || 31 || ~15.42 || 13 || ~6.467 || 5 || ~2.487
|-
| 006.txt || 307 || 73 || ~23.77 || 20 || ~6.514 || 5 || ~1.628
|-
| 007.txt || 417 || 117 || ~28.05 || 38 || ~9.112 || 6 || ~1.438
|-
| 008.txt || 405 || 114 || ~28.14 || 21 || ~5.185 || 9 || ~2.222
|-
| 009.txt || 304 || 74 || ~24.34 || 35 || ~11.51 || 5 || ~1.644
|-
| 010.txt || 157 || 21 || ~13.37 || 16 || ~10.19 || 0 || 0
|-
| 011.txt || 335 || 66 || ~19.70 || 33 || ~9.850 || 3 || ~0.895
|-
| 012.txt || 464 || 80 || ~17.24 || 60 || ~12.93 || 4 || ~0.862
|-
| 013.txt || 155 || 29 || ~18.70 || 21 || ~13.54 || 1 || ~0.645
|-
| 014.txt || 423 || 69 || ~16.31 || 60 || ~14.18 || 5 || ~1.182
|-
| 015.txt || 410 || 110 || ~26.82 || 33 || ~8.048 || 11 || ~2.682
|-
| 016.txt || 346 || 83 || ~23.98 || 38 || ~10.98 || 11 || ~3.179
|-
| 017.txt || 357 || 98 || ~27.45 || 31 || ~8.683 || 9 || ~2.521
|-
| 018.txt || 491 || 74 || ~15.07 || 59 || ~12.01 || 10 || ~2.036
|-
| 019.txt || 378 || 84 || ~22.22 || 27 || ~7.142 || 6 || ~1.587
|-
| 020.txt || 592 || 135 || ~22.80 || 58 || ~9.797 || 4 || ~0.675
|-
|}


===22 July 2007===

All texts from Die Volksblad.

{|class=sortable
! Filename !! Word count !! Unknown (%) !! Transfer error (%) !! Untranslated (%)
|-
| 001.txt || 383 || 59 (~15.40) || 44 (~11.48) || 2 (~0.522)
|-
| 002.txt || 582 || 217 (~37.28) || 29 (~4.982) || 4 (~0.687)
|-
| 003.txt || 943 || 364 (~38.60) || 77 (~8.165) || 4 (~0.424)
|-
| 004.txt || 747 || 285 (~38.15) || 34 (~4.551) || 11 (~1.472)
|-
| 005.txt || 201 || 33 (~16.41) || 12 (~5.970) || 5 (~2.487)
|-
| 006.txt || 307 || 87 (~28.33) || 18 (~5.863) || 0 (0)
|-
| 007.txt || 417 || 142 (~34.05) || 23 (~5.515) || 1 (~0.239)
|-
| 008.txt || 405 || 130 (~32.09) || 11 (~2.716) || 2 (~0.493)
|-
| 009.txt || 304 || 94 (~30.92) || 28 (~9.210) || 2 (~0.657)
|-
| 010.txt || 157 || 24 (~15.28) || 16 (~10.19) || 0 (0)
|-
| 011.txt || 335 || 75 (~22.38) || 31 (~9.253) || 1 (~0.298)
|-
| 012.txt || 464 || 83 (~17.88) || 56 (~12.06) || 4 (~0.862)
|-
| 013.txt || 155 || 30 (~19.35) || 20 (~12.90) || 0 (0)
|-
| 014.txt || 423 || 85 (~20.09) || 60 (~14.18) || 1 (~0.236)
|-
| 015.txt || 410 || 137 (~33.41) || 22 (~5.365) || 5 (~1.219)
|-
| 016.txt || 346 || 100 (~28.90) || 36 (~10.40) || 3 (~0.867)
|-
|}


Latest revision as of 16:21, 20 October 2010

===08 July 2007===

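{|class=sortable
! Newspaper !! Filename !! Total !! Unknown !! Percentage OOV
|-
| Die Volksblad || 001.txt || 383 || 110 || 28.7
|-
| Die Volksblad || 002.txt || 582 || 258 || 44.3
|-
| Die Volksblad || 003.txt || 943 || 433 || 45.9
|-
| Die Volksblad || 004.txt || 747 || 337 || 45.1
|-
| Die Volksblad || 005.txt || 201 || 71 || 35.3
|-
| Die Volksblad || 006.txt || 307 || 121 || 39.4
|-
| Die Volksblad || 007.txt || 417 || 185 || 44.3
|-
| Die Volksblad || 008.txt || 405 || 157 || 38.7
|-
| Die Volksblad || 009.txt || 304 || 120 || 39.4
|-
| Beeld || 001.txt || 309 || 116 || 37.5
|}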

===26 June 2007===

Over a selection of texts from Die Volksblad and Beeld:

{|class=sortable
! Newspaper !! Unknown/Total !! Filename !! Percentage OOV
|-
| Die Volksblad || 117/383 || 001.txt || 30.5
|-
| Die Volksblad || 267/582 || 002.txt || 45.8
|-
| Die Volksblad || 449/943 || 003.txt || 47.6
|-
| Die Volksblad || 347/747 || 004.txt || 46.4
|-
| Die Volksblad || 77/201 || 005.txt || 38.3
|-
| Beeld || 123/309 || 001.txt || 39.8
|}