Talk:Apertium-quality
Latest revision as of 18:20, 21 August 2011
Menu
Getting Started
Technical Documentation
Notes
Community Bonding Period
Week 1 — 25th April
- Must demonstrate, before the end of the bonding period, that setuptools allows a prefix-based installation for non-root users
- Emailed Francis written proof that setuptools adequately meets the expectations and requirements
Week 2 — 2nd May
- Converted the LaTeX source to MediaWiki format and placed it below this section for annotation.
- Completed example regtest.py
- Added Installation and Usage pages, uploaded initial files.
Week 3 — 9th May
- Fixed a Python regression-related bug in regtest.py
- Fixed a personal regression in setup.py
- Plan to add autogen.sh for config
- Consider using virtualenv for rootless installations
- Fixed installation instructions
- SVN and git are now in sync
Coding Period
Week 1 — 23rd May
- Completed autogen.sh
Todo
Tests and stats
Monolingual corpus
- dicts: Coverage
- rules: Rule counting (CG, apertium-transfer)
- rules: number of rules
- dicts: number of entries (sl mono, sl-tl, tl mono) -- lttoolbox/hfst
- dicts: (monolingual) mean ambiguity
- system: translation speed (per module?)
- dicts: (bilingual) mean fertility, i.e. the number of translations per SL/TL word
- rules: for disambiguation, if a pair uses both CG and apertium-tagger, how much of the work does CG do and how much does apertium-tagger do? (count the LUs input to CG, the LUs output from CG and the LUs output from apertium-tagger)
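As a rough illustration of the coverage and mean ambiguity items above, here is a minimal sketch that reads analyser output in the Apertium stream format (`^surface/analysis1/analysis2$`, with unknown words marked as `^word/*word$`). The function name `corpus_stats` is made up for this example, escaped characters inside lexical units are not handled, and mean ambiguity is assumed here to mean "analyses per known token"; the project may settle on a different definition.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (simplification: backslash-escaped ^, $ and / are not handled)
LU_RE = re.compile(r"\^([^/$^]*)((?:/[^/$^]*)+)\$")


def corpus_stats(analysed_text):
    """Return (coverage, mean_ambiguity) for analyser output.

    coverage       = known tokens / all tokens
    mean_ambiguity = analyses / known tokens (assumed definition)
    """
    total = known = analyses = 0
    for match in LU_RE.finditer(analysed_text):
        readings = match.group(2)[1:].split("/")
        total += 1
        if readings[0].startswith("*"):
            # Unknown word: the analyser echoes the surface form with a star.
            continue
        known += 1
        analyses += len(readings)
    coverage = known / total if total else 0.0
    mean_ambiguity = analyses / known if known else 0.0
    return coverage, mean_ambiguity


if __name__ == "__main__":
    sample = ("^the/the<det><def><sp>$ "
              "^books/book<n><pl>/book<vblex><pri><p3><sg>$ "
              "^xyzzy/*xyzzy$")
    print(corpus_stats(sample))
```

The same loop could be run on the input to CG, the output of CG and the output of apertium-tagger to compare how many readings each stage removes.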
Tests
- dictionary tests (e.g. hfst-tester)
- regression tests
- pending tests
- testvoc
- testvoc+bidixvoc (some language pairs have bilingual dictionaries with more than one translation for a given SL word; at the moment testvoc only ever tests the default translation, whereas testvoc+bidixvoc will test them all)
- generation test
- corpus test
Parallel corpus
- WER, PER, BLEU against reference
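For reference, a minimal sketch of WER and PER on whitespace-tokenised sentences. The function names are invented for this example. WER is taken as the word-level edit distance divided by the reference length, and PER uses one common bag-of-words definition, (max(|ref|, |hyp|) - matching words) / |ref|; other variants exist. BLEU is left out because it is better taken from an existing scoring tool.

```python
from collections import Counter


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance normalised by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return word_edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate (word order is ignored)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat on the mat sat"
    print(wer(ref, hyp), per(ref, hyp))
```

In the example the hypothesis contains the right words in the wrong order, so WER penalises it while PER does not; reporting both helps separate lexical errors from reordering errors.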
Graphs
- coverage over time
- number of rules over time
- mean ambiguity over time
- number of dict entries over time
- translation speed over time
- WER/PER/BLEU over time
- percentage of regression tests passed over time
Feature Requests
- Cache the wiki regression test page locally, so that tests can still be run when the wiki is offline or when stuck in an airport with expensive wifi
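One possible shape for the caching request, as a sketch only: try the network first, refresh the local copy on success, and fall back to the cached copy when the download fails. The names `fetch_with_cache` and `_download` are hypothetical and not part of the existing code; the `fetch` parameter exists only so the behaviour can be tried without a network connection.

```python
import os
import urllib.request


def _download(url):
    """Fetch a page over HTTP and return it as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def fetch_with_cache(url, cache_path, fetch=None):
    """Return the page at url, falling back to cache_path when offline.

    On a successful download the cache file is refreshed. If the download
    fails and no cache file exists yet, the FileNotFoundError from open()
    is allowed to propagate.
    """
    fetch = fetch or _download
    try:
        text = fetch(url)
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        with open(cache_path, encoding="utf-8") as cached:
            return cached.read()
    cache_dir = os.path.dirname(cache_path)
    if cache_dir:
        os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "w", encoding="utf-8") as cached:
        cached.write(text)
    return text
```

Preferring the network over the cache keeps the tests current whenever the wiki is reachable; an explicit "offline" switch could be added if even the attempt is too slow on a bad connection.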
Extensions
Sanity Tests
Simply allow the use of a sanity_tests directory inside a dictionary directory; if it is found, run every script in it and store each script's name and return value in quality-stats.xml. The scripts can then be written in any language, provided they return a non-zero value on error.
Possible tests:
- Superblank order test
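The sanity test mechanism described above could look roughly like this. It is a sketch under assumptions: the function names are invented, only executable files are run, and the XML element and attribute names (`sanity-tests`, `test`, `name`, `returncode`) are placeholders, since the layout of quality-stats.xml is not fixed here.

```python
import os
import subprocess
import xml.etree.ElementTree as ET


def run_sanity_tests(dict_dir, subdir="sanity_tests"):
    """Run every executable file in dict_dir/sanity_tests.

    Returns a list of (script_name, return_code) pairs, sorted by name.
    A missing directory simply yields an empty list.
    """
    test_dir = os.path.join(dict_dir, subdir)
    if not os.path.isdir(test_dir):
        return []
    results = []
    for name in sorted(os.listdir(test_dir)):
        path = os.path.abspath(os.path.join(test_dir, name))
        if os.path.isfile(path) and os.access(path, os.X_OK):
            # Scripts are executed directly, so any language with a
            # shebang line works; non-zero return codes signal errors.
            completed = subprocess.run([path], cwd=dict_dir)
            results.append((name, completed.returncode))
    return results


def results_to_xml(results):
    """Build an XML element holding each script name and return code.

    Element and attribute names are placeholders, not a settled schema.
    """
    root = ET.Element("sanity-tests")
    for name, code in results:
        ET.SubElement(root, "test", name=name, returncode=str(code))
    return root


if __name__ == "__main__":
    outcome = run_sanity_tests(".")
    print(ET.tostring(results_to_xml(outcome), encoding="unicode"))
```

Running the scripts with the dictionary directory as the working directory lets each test refer to the dictionary files by relative path.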