Difference between revisions of "Talk:Apertium-quality"

= Menu =

=== Getting Started ===
* [[Quality_control_framework/Installation|Installation]]
* [[Quality_control_framework/Usage|Usage]]

=== Technical Documentation ===
* [[Quality_control_framework/Proposal|Proposal]]
* [[Quality_control_framework/XML_Schema|XML Schema]]

= Notes =

Revision as of 15:18, 24 May 2011

Menu

Getting Started

Technical Documentation

Notes

Community Bonding Period

Week 1 — 25th April

  • Must demonstrate, before the end of the bonding period, that setuptools allows a prefix-based installation for non-root users.
  • Emailed Francis written proof that setuptools adequately meets the expectations and requirements.
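
A quick way to see where a prefix-based install would place pure-Python modules is to ask the standard library. This is only a sketch, not the proof mentioned above: it assumes a Python recent enough to ship the `sysconfig` module, uses the `posix_prefix` install scheme, and the prefix `~/apertium-quality` is a hypothetical choice.

```python
import os
import sysconfig

def site_packages_for(prefix):
    """Return the directory a POSIX prefix install would use for pure-Python modules."""
    return sysconfig.get_path(
        "purelib",
        scheme="posix_prefix",
        vars={"base": prefix, "platbase": prefix},
    )

# Hypothetical prefix inside the user's home directory, so no root access is needed.
prefix = os.path.expanduser("~/apertium-quality")
target = site_packages_for(prefix)
print(target)
```

The install itself would then be something like `python setup.py install --prefix="$HOME/apertium-quality"`. setuptools generally expects the printed directory to exist and to be on `PYTHONPATH`, so it is worth exporting that first.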

Week 2 — 2nd May

  • Converted the LaTeX source to MediaWiki format and placed it below this section for annotation.
  • Completed example regtest.py
  • Added Installation and Usage pages, uploaded initial files.

Week 3 — 9th May

  • Fixed a Python regression-related bug in regtest.py
  • Fixed a personal regression in setup.py
  • Plan to add autogen.sh for config
  • Consider using virtualenv for rootless installations
  • Fixed installation instructions
  • SVN and git repositories are now in sync

Coding Period

Week 1 — 23rd May

  • Completed autogen.sh

Todo

  1. Complete the todo.


Tests and stats

Monolingual corpus
  • dicts: Coverage
  • rules: Rule counting (CG, apertium-transfer)
  • rules: number of rules
  • dicts: number of entries (source-language monolingual, bilingual, target-language monolingual), for lttoolbox/hfst
  • dicts: mean ambiguity
  • system: translation speed (per module?)
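
The coverage and mean-ambiguity statistics could be computed from morphological analyser output roughly as follows. This is a minimal sketch with stated assumptions: the input is in the Apertium stream format (`^surface/analysis1/analysis2$`, unknown words marked with `*`), it contains no escaped `/`, `^` or `$` characters, coverage means "known tokens divided by all tokens", and mean ambiguity means "analyses per known token". The function names are illustrative and not part of any existing tool.

```python
import re

# Matches one lexical unit in Apertium stream format, e.g. ^cats/cat<n><pl>$
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")

def parse_units(analysed_text):
    """Yield (surface_form, analyses) for each lexical unit in analyser output."""
    for match in LEXICAL_UNIT.finditer(analysed_text):
        surface, *analyses = match.group(1).split("/")
        yield surface, analyses

def is_unknown(analyses):
    """A token is unknown if it has no analysis or its analysis starts with '*'."""
    return not analyses or analyses[0].startswith("*")

def coverage(analysed_text):
    """Fraction of tokens that received at least one real analysis."""
    units = list(parse_units(analysed_text))
    if not units:
        return 0.0
    known = sum(1 for _, analyses in units if not is_unknown(analyses))
    return known / len(units)

def mean_ambiguity(analysed_text):
    """Average number of analyses per known token."""
    counts = [len(a) for _, a in parse_units(analysed_text) if not is_unknown(a)]
    return sum(counts) / len(counts) if counts else 0.0

# Toy input: two known tokens (one of them ambiguous) and one unknown token.
sample = "^the/the<det><def><sp>$ ^books/book<n><pl>/book<vblex><pri><p3><sg>$ ^xyzzy/*xyzzy$"
print(coverage(sample))        # 2 of 3 tokens are known
print(mean_ambiguity(sample))  # (1 + 2) analyses over 2 known tokens
```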
Tests
  • dictionary tests (e.g. hfst-tester)
  • regression tests
  • pending tests
  • testvoc
  • generation test
  • corpus test
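
A regression test in this setting boils down to comparing pipeline output against stored expectations. The sketch below is not the `regtest.py` mentioned in the notes above. It is a hypothetical illustration in which `translate` is any callable that maps a source string to a translation, and the toy translator stands in for a real pipeline.

```python
def run_regression_tests(translate, cases):
    """Run (source, expected) pairs through `translate`; return pass rate and failures."""
    failures = []
    for source, expected in cases:
        actual = translate(source)
        if actual.strip() != expected.strip():
            failures.append((source, expected, actual))
    passed = len(cases) - len(failures)
    percent = 100.0 * passed / len(cases) if cases else 0.0
    return percent, failures

# Stand-in translator for illustration only; a real run would call the pipeline.
def toy_translate(text):
    return {"hello": "hola", "cat": "gato"}.get(text, "*" + text)

cases = [("hello", "hola"), ("cat", "gato"), ("dog", "perro")]
percent, failures = run_regression_tests(toy_translate, cases)
print(round(percent, 1), failures)
```

The returned percentage is the figure that a "regression tests passed over time" graph would track, and the failure list gives the source, expected and actual strings for each mismatch.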
Parallel corpus
  • WER, PER, BLEU against reference
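
For reference, WER and PER can be sketched in a few lines. The assumptions here are whitespace tokenisation, a non-empty reference, WER defined as word-level edit distance divided by reference length, and PER taken in one common form, `(max(len(ref), len(hyp)) - matching words) / len(ref)`, which ignores word order. BLEU is left out because it needs n-gram counting and a brevity penalty.

```python
from collections import Counter

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]

def wer(reference, hypothesis):
    """Word error rate; assumes the reference is not empty."""
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)

def per(reference, hypothesis):
    """Position-independent error rate; assumes the reference is not empty."""
    ref, hyp = reference.split(), hypothesis.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)

reference = "the cat sat on the mat"
hypothesis = "the cat on the mat sat"
print(wer(reference, hypothesis))  # word order matters here
print(per(reference, hypothesis))  # same bag of words, so no errors
```

The example pair contains the same words in a different order, so WER penalises it while PER does not, which is the practical difference between the two metrics.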
Graphs
  • coverage over time
  • number of rules over time
  • mean ambiguity over time
  • number of dict entries over time
  • translation speed over time
  • WER/PER/BLEU over time
  • percentage of regression tests passed over time
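
All of these graphs share one shape: a metric recorded per date and then plotted in order. The sketch below assumes a hypothetical CSV history file with a `date` column in ISO format and one column per metric. The sample numbers are made up purely to exercise the code.

```python
import csv
import io

def metric_series(history_csv, metric):
    """Return (date, value) pairs for one metric, sorted by ISO date."""
    rows = csv.DictReader(io.StringIO(history_csv))
    series = [(row["date"], float(row[metric])) for row in rows]
    return sorted(series)

# Made-up sample history, deliberately out of order.
history = """date,coverage,rules
2011-05-23,0.82,140
2011-05-09,0.79,131
2011-05-16,0.80,133
"""
print(metric_series(history, "coverage"))
```

The sorted pairs can be handed to any plotting tool. ISO dates are used so that plain string sorting is also chronological.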