User:Popcorndude/Unit-Testing
Proposed file structure for unit testing framework.
Specification
make test will check for a file named tests/tests.yaml, which can include other files in the same directory.
At the top level of a test file, the include key is a list of included files (paths given relative to the directory of the current file). All other keys are names of tests.
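The relative-path rule for included files could be sketched as follows. The helper name resolve_includes and its signature are hypothetical and not part of the proposal; the sketch only shows paths being resolved against the directory of the including file.

```python
from pathlib import PurePosixPath


def resolve_includes(current_file, include_list):
    """Resolve each included path relative to the directory of current_file.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    base = PurePosixPath(current_file).parent
    return [str(base / name) for name in include_list]


print(resolve_includes("tests/tests.yaml", ["other_file_1.yaml", "other_file_2.yaml"]))
```

Using PurePosixPath keeps the sketch independent of the host operating system; a real implementation would more likely use the native path type.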
Each test consists of a collection of pairs of strings, which can be written as left: right or placed in a TSV file and referenced with tsv-file: [path].
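Reading such a TSV file might look like the sketch below. It assumes two tab-separated columns (left, then right) and that blank lines are skipped; the function name read_pairs is hypothetical.

```python
import csv
import io


def read_pairs(tsv_stream):
    """Return (left, right) pairs from a two-column tab-separated stream.

    Assumption: rows with fewer than two columns (e.g. blank lines)
    are ignored; the proposal does not say how they should be handled.
    """
    pairs = []
    # QUOTE_NONE keeps quote characters in the test data literal.
    for row in csv.reader(tsv_stream, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) >= 2:
            pairs.append((row[0], row[1]))
    return pairs


sample = io.StringIO("sang\t^sing<vblex><past>$\njumped\t^jump<vblex><past>$\n")
print(read_pairs(sample))
```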
Each test also specifies what mode to run. This specification can be either lr (left side is input) or rl (right side is input). Bidirectional tests may be created by listing both.
[direction]:
  mode: [mode-name]
  match: [match-mode]
  options: [option-string]
match-mode can be one of exact (output must be exactly as written in the test), include (the values in the test must be present in the output), or exclude (the values in the test must not be present in the output). It defaults to exact if not specified.
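One possible reading of the three match modes is sketched below. It treats the expected value as a single substring of the output, which is an assumption; a real implementation might instead compare individual readings. The function name check is hypothetical.

```python
def check(expected, output, match_mode="exact"):
    """Return True if output satisfies the test under match_mode.

    Assumption: include and exclude are plain substring checks.
    """
    if match_mode == "exact":
        return output == expected
    if match_mode == "include":
        return expected in output
    if match_mode == "exclude":
        return expected not in output
    raise ValueError(f"unknown match mode: {match_mode!r}")


print(check("sing<vblex><past>", "^sang/sing<vblex><past>$", "include"))
print(check("be<vbser><imp>", "^be/be<vbser><inf>$", "exclude"))
```

Raising on an unknown mode, rather than silently falling back to exact, is a design choice of this sketch and not something the proposal specifies.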
option-string will be passed to the apertium executable. It defaults to -f none.
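As an illustration of how the option string might reach the executable, the sketch below builds an argument list without running anything. Placing the options before the mode name follows the usual apertium command-line usage, but the exact invocation used by the test runner is an assumption, as is the name build_command.

```python
import shlex


def build_command(mode_name, option_string="-f none"):
    """Build an argument list for the apertium executable.

    Hypothetical helper: the proposal only says that the option string
    is passed to apertium; the argument order here is an assumption.
    """
    return ["apertium"] + shlex.split(option_string) + [mode_name]


print(build_command("eng-spa"))  # ['apertium', '-f', 'none', 'eng-spa']
```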
If neither match nor options is specified, this can be abbreviated to [direction]: [mode-name].
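The abbreviated and full forms could be normalized, and the string pairs oriented per direction, roughly as follows. This sketch assumes the test has already been loaded into a plain dictionary; the names normalize_direction and expand_test are hypothetical, and pairs stored in a TSV file are not handled here.

```python
DEFAULTS = {"match": "exact", "options": "-f none"}
RESERVED_KEYS = {"lr", "rl", "tsv-file"}


def normalize_direction(spec):
    """Expand the short form (a bare mode name) into the full form."""
    if isinstance(spec, str):
        spec = {"mode": spec}
    return {**DEFAULTS, **spec}


def expand_test(test):
    """Return one (settings, input, expected) tuple per direction and pair."""
    pairs = [(k, v) for k, v in test.items() if k not in RESERVED_KEYS]
    runs = []
    for direction in ("lr", "rl"):
        if direction not in test:
            continue
        settings = normalize_direction(test[direction])
        for left, right in pairs:
            # lr: left side is input; rl: right side is input
            source, target = (left, right) if direction == "lr" else (right, left)
            runs.append((settings, source, target))
    return runs


possession = {
    "lr": "eng-spa",
    "rl": "spa-eng",
    "the cat's box": "la caja del gato",
}
for run in expand_test(possession):
    print(run)
```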
Example Files
apertium-eng-spa/tests/tests.yaml
include:
  - other_file_1.yaml
  - other_file_2.yaml
"possession":
  lr: eng-spa
  rl: spa-eng
  "the cat's box": "la caja del gato"
  "my sister's socks": "los calcetines de mi hermana"
"noun/verb disam":
  lr:
    mode: eng-spa-tagger
    match: exact
  "the cat's box": "^the/the<det><def><sp>$ ^cat's/cat<n><sg>+'s<gen>$ ^box/box<n><sg>$"
apertium-eng/tests/tests.yaml
"past tense":
  lr:
    mode: eng-morph
    match: include
  rl:
    mode: eng-gener
    match: exact
  tsv-file: past-tense-tests.tsv
"disam":
  lr:
    mode: eng-tagger
    match: exclude
  "to be purple": "^to/$ ^be/be<vbser><imp>$ ^purple/$"
apertium-eng/tests/past-tense-tests.tsv
sang	^sing<vblex><past>$
jumped	^jump<vblex><past>$
Annotated Example Files
apertium-eng-spa/tests/tests.yaml
include:  # run the tests in these files as well
  - other_file_1.yaml
  - other_file_2.yaml
"possession":  # test named "possession"
  lr: eng-spa  # left side | apertium eng-spa => right side
  rl: spa-eng  # right side | apertium spa-eng => left side
  "the cat's box": "la caja del gato"
  "my sister's socks": "los calcetines de mi hermana"
"noun/verb disam":  # test named "noun/verb disam"
  lr:  # only has 1 direction
    mode: eng-spa-tagger
    match: exact  # output must match what we've written below exactly
                  # this is the default, but we're being explicit
  "the cat's box": "^the/the<det><def><sp>$ ^cat's/cat<n><sg>+'s<gen>$ ^box/box<n><sg>$"
apertium-eng/tests/tests.yaml
"past tense":
  lr:
    mode: eng-morph
    match: include  # the right side of the test must appear in the output
                    # but the test will still pass if other things appear as well
  rl:
    mode: eng-gener
    match: exact
  tsv-file: past-tense-tests.tsv  # read the test data from a tab-separated list
"disam":
  lr:
    mode: eng-tagger
    match: exclude  # the output can contain other things, but must not contain
                    # the readings listed
  "to be purple": "^to/$ ^be/be<vbser><imp>$ ^purple/$"