Difference between revisions of "Apertium-test/teststats/"

From Apertium
 
(40 intermediate revisions by the same user not shown)
Line 14: Line 14:
 
*'''[https://svn.code.sf.net/p/apertium/svn/languages/apertium-kaz/apertium-kaz.kaz.rlx rlx rules]''': <section begin=rlx_rules />140<section end=rlx_rules /> as of r83396 by spectre360 ~ [[User:StemCounterBot|StemCounterBot]] ([[User talk:StemCounterBot|talk]]) 09:57, 24 December 2017 (CET), run by grzegorzs_
 
*'''[https://svn.code.sf.net/p/apertium/svn/languages/apertium-kaz/apertium-kaz.kaz.lexc vanilla stems]''': <section begin=vanilla_stems />26,748<section end=vanilla_stems /> as of r83396 by spectre360 ~ [[User:StemCounterBot|StemCounterBot]] ([[User talk:StemCounterBot|talk]]) 09:57, 24 December 2017 (CET), run by grzegorzs_
 
  +
[[Category:Datastats]]
 
 
== Corpora ==
 
   
  +
[https://google.com google]
Әуезов
 
* words: <section begin=Әуезов-words />155K<section end=Әуезов-words />
 
* coverage: ~<section begin=Әуезов-coverage />92.89<section end=Әуезов-coverage />%
 
* as of: r65751
 
* wikipage: <section begin=Әуезов-wikipage />Әуезов corpus<section end=Әуезов-wikipage />
 
 
bible
 
* words: <section begin=bible-words />577K<section end=bible-words />
 
* coverage: ~<section begin=bible-coverage />95.29<section end=bible-coverage />%
 
* as of: r65751
 
 
azattyq2010
 
* words: <section begin=azattyq2010-words />3.2M<section end=azattyq2010-words />
 
* coverage: ~<section begin=azattyq2010-coverage />95.07<section end=azattyq2010-coverage />%
 
* as of: r65817
 
* wikipage: <section begin=azattyq2010-wikipage />RFERL_corpora<section end=azattyq2010-wikipage />
 
 
wp2011
 
* words: <section begin=wp2011-words />850K<section end=wp2011-words />
 
* coverage: ~<section begin=wp2011-coverage />90.72<section end=wp2011-coverage />%
 
* as of: r65751
 
 
wp2013
 
* words: <section begin=wp2013-words />18.2M<section end=wp2013-words />
 
* coverage: ~<section begin=wp2013-coverage />90.10<section end=wp2013-coverage />%
 
* as of: r65751
 
 
quran
 
* words: <section begin=quran-words />107K<section end=quran-words />
 
* coverage: ~<section begin=quran-coverage />96.71<section end=quran-coverage />%
 
* as of: r65751
 
 
UDHR
 
* words: <section begin=udhr-words />1.5K<section end=udhr-words />
 
* coverage: ~<section begin=udhr-coverage />96.86<section end=udhr-coverage />%
 
* as of: r65817
 
* wikipage: <section begin=udhr-wikipage />UDHR<section end=udhr-wikipage />
 
   
 
wp2017
 
* words: <section begin=wp2017-words />4.8M<section end=wp2017-words />
+
* words: <section begin=wp2017-words />3.8M<section end=wp2017-words />
* coverage: ~<section begin=wp2017-coverage />50.3<section end=wp2017-coverage />%
+
* coverage: ~<section begin=wp2017-coverage />93.3<section end=wp2017-coverage />%
* as of: r7644
 
 
wp2017
 
* words: <section begin=wp2017-words />4.8M<section end=wp2017-words />
 
* coverage: ~<section begin=wp2017-coverage />50.3<section end=wp2017-coverage />%
 
* as of: r7644
 
 
wp2017
 
* words: <section begin=wp2017-words />4.8M<section end=wp2017-words />
 
* coverage: ~<section begin=wp2017-coverage />50.3<section end=wp2017-coverage />%
 
 
* as of: r76449
 
* as of: r76449
   
wp2017
 
* words: <section begin=wp2017-words />4.8M<section end=wp2017-words />
 
* coverage: ~<section begin=wp2017-coverage />50.3<section end=wp2017-coverage />%
 
* as of: r76449
 
 
[[Category:Datastats]]
 

Latest revision as of 19:49, 3 January 2018

The language

In Apertium


Over-all stats

Corpora

google

wp2017

  • words: 3.8M
  • coverage: ~93.3%
  • as of: r76449