Difference between revisions of "Talk:Apertium-quality"

Revision as of 16:57, 22 May 2011


Notes

Community Bonding Period

Week 1 — 25th April

  • Must demonstrate, before the end of the bonding period, that setuptools allows a prefix-based installation for non-root users.
  • Emailed Francis a written demonstration that setuptools meets the expectations and requirements.

Week 2 — 2nd May

  • Converted the LaTeX source to MediaWiki format and placed it below this section for annotation.
  • Completed example regtest.py
  • Added Installation and Usage pages, uploaded initial files.

Week 3 — 9th May

  • Fixed a Python regression-related bug in regtest.py
  • Fixed a personal regression in setup.py
  • Plan to add autogen.sh for config
  • Consider using virtualenv for rootless installations
  • Fixed installation instructions
  • SVN and git are now in sync

Coding Period

Week 1 — 23rd May

  • Completed autogen.sh

Todo

  1. Complete the todo.


Tests and stats

Monolingual corpus
  • dicts: Coverage
  • rules: Rule counting (CG, apertium-transfer)
  • rules: number of rules
  • dicts: number of entries (sl mono, sl-tl, tl mono) -- lttoolbox/hfst
  • dicts: mean ambiguity
  • system: translation speed (per module?)
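A minimal sketch of how the coverage and mean ambiguity figures above could be computed from analyser output. It assumes the corpus has already been run through the analyser and is in Apertium stream format (`^surface/analysis1/analysis2$`, with unknown words marked by a leading `*`). The function names are hypothetical, and escaped characters inside lexical units are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def parse_units(analysed_text):
    """Yield (surface, analyses) for each lexical unit in the text."""
    for match in LEXICAL_UNIT.finditer(analysed_text):
        surface, *analyses = match.group(1).split("/")
        yield surface, analyses


def coverage_and_ambiguity(analysed_text):
    """Return (coverage, mean ambiguity) for analysed text.

    Coverage is the fraction of tokens with at least one analysis.
    Mean ambiguity is the average number of analyses per known token.
    """
    total = known = analysis_count = 0
    for _surface, analyses in parse_units(analysed_text):
        total += 1
        # Unknown words carry a single pseudo-analysis starting with '*'.
        if analyses and not analyses[0].startswith("*"):
            known += 1
            analysis_count += len(analyses)
    coverage = known / total if total else 0.0
    mean_ambiguity = analysis_count / known if known else 0.0
    return coverage, mean_ambiguity


# Made-up sample: two known tokens (one ambiguous) and one unknown token.
sample = (
    "^the/the<det><def><sp>$ "
    "^books/book<n><pl>/book<vblex><pri><p3><sg>$ "
    "^zxqv/*zxqv$"
)
coverage, mean_ambiguity = coverage_and_ambiguity(sample)
print(f"coverage: {coverage:.2%}, mean ambiguity: {mean_ambiguity:.2f}")
```

Whether mean ambiguity should be averaged over known tokens only (as here) or over all tokens is a design choice that would need to be settled before comparing figures over time.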
Tests
  • dictionary tests (e.g. hfst-tester)
  • regression tests
  • pending tests
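The regression tests listed above could feed the "percentage passed" statistic roughly as follows. This is only a sketch: `run_regression_tests` and `toy_translate` are hypothetical names, and the toy translator stands in for a real translation pipeline, which this page does not specify.

```python
def run_regression_tests(tests, translate):
    """Run (source, expected) pairs through translate.

    Returns the percentage of tests passed and a list of
    (source, expected, actual) tuples for the failures.
    """
    failures = []
    for source, expected in tests:
        actual = translate(source)
        if actual != expected:
            failures.append((source, expected, actual))
    passed = len(tests) - len(failures)
    percentage = 100.0 * passed / len(tests) if tests else 0.0
    return percentage, failures


def toy_translate(text):
    """Hypothetical stand-in for a real translator."""
    return {"hola": "hello", "gato": "cat"}.get(text, "*" + text)


tests = [("hola", "hello"), ("gato", "cat"), ("perro", "dog")]
percentage, failures = run_regression_tests(tests, toy_translate)
print(f"{percentage:.1f}% passed, {len(failures)} failure(s)")
```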
Parallel corpus
  • WER, PER, BLEU against reference
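A sketch of the two simpler metrics above, under common definitions: WER as the word-level edit distance divided by the reference length, and PER as a position-independent variant that compares bags of words. The function names are hypothetical, tokenisation is plain whitespace splitting, and BLEU is left out because it needs n-gram statistics and a brevity penalty.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate: word order is ignored."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    # Number of words shared by both sentences, counted as multisets.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


reference = "the cat sat on the mat"
hypothesis = "the cat on the mat sat"
print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"PER: {per(reference, hypothesis):.3f}")
```

In this example the hypothesis contains the same words in a different order, so PER is zero while WER is not, which is why the two figures are worth tracking together.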
Graphs
  • coverage over time
  • number of rules over time
  • mean ambiguity over time
  • number of dict entries over time
  • translation speed over time
  • WER/PER/BLEU over time
  • percentage of regression tests passed over time