pyret-lang
Better support for property based testing
Currently, it's very natural to express properties using `satisfies`, and with a fix for #1633, debugging errors that are uncovered by generated input is quite straightforward. The last typical piece of property-based testing is a way to shrink bad test inputs (ones that either cause errors or fail to pass the predicate).
It's possible to do this currently, but it's quite a bit of work: you essentially have to run the tests multiple times outside of the testing framework, and then only actually emit a `satisfies` once you know you've gotten to the smallest case (and you have to define this helper code within the test block, since the `satisfies` can't appear elsewhere), which really isn't great.
I wonder if it's worth adding built in support for this pattern, i.e., something like:
```
generate-input() satisfies my-property minimize my-shrink
```
Or something of that sort. The semantics would be:
- If `generate-input()` errors, we just report the error.
- If `my-property` either raises an error or returns false on the left-side value, we call `my-shrink` on that input. `my-shrink` should return an option: if it's `none`, we can't make it smaller, so we report the failing test. If it is `some(v)`, we re-run `my-property` on `v`. If it still fails, we keep going, whereas if it passes, we instead report the previous input as our smallest erroneous case.
Some open questions:
- Does the new error have to be identical? That is, if the reason for the test failing changes as the input gets smaller, do we still want to take the smaller input? For simplicity, the answer is probably yes: the goal is to return the smallest input that did not pass the test.
- Do we also want to show the original failing input? That is, show that the test failed on some input, and then that we shrunk it to some other input that still failed. If we do this, whether the error is the same matters: if we don't ensure the errors match, we should show both errors.
Quick ideas: I think a generalization of this is that there are many cases where the calculation's result, along with other information about the expression on the LHS, ought to be processed before being reported. I could see having some kind of a `report`, `format`, or `process` option for all testing forms that takes in information about the test and post-processes it.
It could also be that `satisfies` takes a `%(post-process)` that allows for this kind of refinement.
I like the idea of `satisfies%(post-process)` (I initially wondered if there was a way to get this to work using just `is%(something)`, but the visibility of the smaller input is the problem), as it does seem like there should be a generalization that works.
I think, at least for this use case, what `post-process` would need is: the original input, whether the test passed, and a return value indicating whether the test should be re-run on a different input.
I.e., something like:
```
{ input : A, result : Exn | Fail | Pass } -> ReportResult | Rerun A
```
Perhaps `ReportResult` above could carry additional info (if the idea was that you were giving further clarification as to why a particular input would have failed...).
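To make that signature concrete, here is one way the protocol could be modeled, again as a hedged Python sketch rather than a Pyret design: the dataclasses mirror the `Exn | Fail | Pass` and `ReportResult | Rerun A` alternatives, and `shrinking_post_process` is a hypothetical post-processor that halves a failing integer input until shrinking makes no progress:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

A = TypeVar("A")

# The outcome of running the test body: Exn | Fail | Pass.
@dataclass
class Exn:
    error: Exception

@dataclass
class Fail:
    pass

@dataclass
class Pass:
    pass

@dataclass
class TestOutcome(Generic[A]):
    input: A
    result: Union[Exn, Fail, Pass]

# The post-processor's return value: ReportResult | Rerun A.
@dataclass
class ReportResult:
    message: str  # room for extra info about why the input failed

@dataclass
class Rerun(Generic[A]):
    input: A

def shrinking_post_process(outcome: TestOutcome[int]):
    # Hypothetical shrinker for integer inputs: halve until the test
    # passes or the input can't get smaller.
    if isinstance(outcome.result, Pass):
        return ReportResult("previous input was the smallest failure")
    smaller = outcome.input // 2
    if smaller == outcome.input:
        return ReportResult(f"smallest failing input: {outcome.input}")
    return Rerun(smaller)
```

The framework would call the post-processor after each run: a `Rerun` answer feeds the new input back into the test, while a `ReportResult` ends the loop and is what gets shown to the user.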