raiden-contracts
formalized and randomized (smart-contracts) testing
Problem Definition
Currently we rely on developers to identify, define, and test edge cases. Whenever a developer believes something is not sufficiently tested, re-testing issues like #89 and #92 are created.
From this, the question arises: when do we consider something to be sufficiently tested? This question is especially pressing for smart contract testing. When discussing #105 with @loredanacirstea, we agreed that the approach taken there does not solve the problem, as it mostly adds to the quantity of smart-contract testing, whereas we should address both quantity and quality.
Proposal
Formalize the protocol rules of crucial parts of `/raiden` and `/raiden-contracts`. From this formalization, derive random assignments for the data structures to be tested via a generalized testing fixture.
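As an illustration, here is a minimal sketch of such a generalized fixture, assuming a hypothetical constraint table and channel-state shape (the names are illustrative, not the actual raiden data structures):

```python
import random

# Hypothetical formalization of one protocol rule: each constraint is a
# predicate over a sampled channel state. Names are illustrative only.
CONSTRAINTS = {
    "transferred_le_available": lambda s: (
        s["transferred_amount"]
        <= s["deposit_1"] + s["deposit_2"] - s["withdrawn_1"] - s["withdrawn_2"]
    ),
}

def sample_channel_state(rng: random.Random) -> dict:
    """Draw one random assignment for the channel data structure."""
    return {key: rng.randint(0, 10**6) for key in
            ("deposit_1", "deposit_2", "withdrawn_1",
             "withdrawn_2", "transferred_amount")}

def sample_valid_state(rng: random.Random) -> dict:
    """Rejection-sample until all formalized constraints hold."""
    while True:
        state = sample_channel_state(rng)
        if all(check(state) for check in CONSTRAINTS.values()):
            return state
```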
To be discussed/questions
To generate pathological cases, we only have to randomly select a subset of the constraints to be violated. This should reliably trigger errors. The major advantage of probabilistic testing is, in my opinion, that we might also discover bugs we did not account for in the long run. The formalization of protocol rules helps with reusability and gives a definite answer to when testing is sufficient, and could therefore speed up our development workflow in the future.
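Building on the sketch above, pathological cases could be drawn by picking a random subset of constraints and rejection-sampling states that violate exactly those (again a hypothetical helper, not existing code):

```python
import random

def sample_pathological_state(rng, constraints, sample, max_tries=1000):
    """Sample a state violating a randomly chosen subset of constraints
    while satisfying the rest; the contract under test should reliably
    reject such states.

    `constraints` maps names to predicates; `sample` draws random states,
    matching the shapes used in the sketch above.
    """
    names = list(constraints)
    violated = set(rng.sample(names, rng.randint(1, len(names))))
    for _ in range(max_tries):
        state = sample(rng)
        # A constraint in `violated` must evaluate False, all others True.
        if all(constraints[n](state) != (n in violated) for n in names):
            return state, violated
    raise RuntimeError("could not sample a state for this violation pattern")
```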
- [x] discuss viability, feasibility and benefits
- [x] approve or dismiss
- [ ] if approved, decide on a timeline
The test was disabled until #108 is (partially) implemented:
- [ ] Implement correct execution paths to be executed by the state machine (e.g. `open -> close: true`, `close -> open: false`, `open -> close -> open: true`, etc.)
- [ ] Implement validity/invalidity of the sampled data (e.g. `transferred_amount <= deposit_1 + deposit_2 - withdrawn_1 - withdrawn_2: true`, `[...] >= [...]: false`)
In other words: we need to define correct/incorrect state transitions for the state machine test implementation, so that we can properly assert tests passing/failing. Additionally, we need to define validity/invalidity of the sampled data and their relation, i.e. these four cases and their expected results (see the sketch after the list):
- valid execution path x valid values -> test_passes
- invalid execution path x valid values -> test_fails
- valid execution path x invalid values -> test_fails
- invalid execution path x invalid values -> test_fails
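A minimal sketch of how these four cases could be asserted, assuming a toy transition table and a toy value constraint in place of the real contract logic:

```python
import pytest

# Toy state machine: only these transitions are valid (illustrative only).
VALID_TRANSITIONS = {("open", "close"), ("close", "settle")}

def path_is_valid(path):
    return all((a, b) in VALID_TRANSITIONS for a, b in zip(path, path[1:]))

def values_are_valid(v):
    return (v["transferred_amount"]
            <= v["deposit_1"] + v["deposit_2"]
            - v["withdrawn_1"] - v["withdrawn_2"])

GOOD = dict(deposit_1=50, deposit_2=50, withdrawn_1=10,
            withdrawn_2=10, transferred_amount=60)
BAD = dict(GOOD, transferred_amount=200)  # violates the value constraint

@pytest.mark.parametrize("path,values,expected", [
    (["open", "close"], GOOD, True),    # valid path x valid values
    (["close", "open"], GOOD, False),   # invalid path x valid values
    (["open", "close"], BAD, False),    # valid path x invalid values
    (["close", "open"], BAD, False),    # invalid path x invalid values
])
def test_transition(path, values, expected):
    assert (path_is_valid(path) and values_are_valid(values)) == expected
```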
Take a look at the PR that removed the property and smart tests; as we already discussed, you should make sure those are now covered by tests in the smart contracts repo.
The property tests using the `hypothesis` module tried to do something close to what you are describing.
I looked into hypothesis, but I think it is beneficial to do this ourselves, as we should fully control the testing and might also reuse the formalizations for docs/tutorials. When do you expect this to be done? This is not covered by any issues so far.
> When do you expect this to be done?
I don't expect this to be done by Red Eyes. Having more formalized tests is an improvement to our testing infrastructure, which, though nice, is definitely not needed to launch Red Eyes.
> I looked into hypothesis, but I think it is beneficial to do this ourselves, as we should fully control the testing and might also reuse the formalizations for docs/tutorials.
I am not convinced. I would go for the minimal-effort way of re-using an existing library rather than creating a driver for these property-testing scenarios on our own.
> I am not convinced. I would go for the minimal-effort way of re-using an existing library rather than creating a driver for these property-testing scenarios on our own.
Hypothesis could be used for `/raiden`, but I don't think it will help with smart contract testing because of Solidity.
@err508 It's probably best to use the hypothesis library. Randomized tests will, by definition, generate a lot of noise; the hardest part of understanding a failed random test is extracting the minimal reproducible example, and that is exactly why the library exists.
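For illustration, a minimal hypothesis-based sketch, assuming a toy `settle()` stand-in for the contract arithmetic; the point is that when the assertion fails, hypothesis automatically shrinks the inputs to a minimal counterexample rather than leaving us a raw random seed to decode:

```python
from hypothesis import given, strategies as st

def settle(deposit_1, deposit_2, transferred):
    # Toy stand-in for the contract's settlement arithmetic.
    return min(transferred, deposit_1 + deposit_2)

@given(st.integers(0, 10**6), st.integers(0, 10**6), st.integers(0, 10**6))
def test_settlement_is_bounded(deposit_1, deposit_2, transferred):
    # On failure, hypothesis reports the smallest triple that still fails.
    assert settle(deposit_1, deposit_2, transferred) <= deposit_1 + deposit_2
```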
> reuse the formalizations for docs/tutorials
Do you have any formal language in mind to describe the expected behavior? (At some point I considered using something like TLA+, but that in itself is a huge effort.)
@hackaugusto I still don't see how we would use Hypothesis for SC testing. I would just log timestamps + test initializations before running the test. If Travis has some form of logging or notification when tests fail, this should be easy. For formal languages I'd just go for TeX; especially in the scenario I described, good old mathematics will surely be expressive enough.
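A minimal sketch of the seed-logging approach described above (a hypothetical helper; any RNG setup that records its seed before the run would do):

```python
import logging
import random
import time

logging.basicConfig(level=logging.INFO)

def make_rng() -> random.Random:
    """Record the seed (derived from a timestamp) before a randomized run,
    so a CI failure can be replayed deterministically from the log."""
    seed = int(time.time() * 1000)
    logging.info("randomized test run: seed=%d", seed)
    return random.Random(seed)
```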
@err508 I'd try lots of random add/remove operations on the hashlock tree and see whether the accounting holds afterwards. I remember a bug in a binary tree in Haskell's Data.Map. The bug was easy to see with a property test (lots of random add/remove on the tree, then checking whether the tree was still balanced).
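A sketch of this add/remove idea using hypothesis' stateful testing, assuming a dict-backed toy model of the hashlock accounting rather than the actual contract:

```python
from hypothesis import strategies as st
from hypothesis.stateful import RuleBasedStateMachine, invariant, rule

class HashlockAccounting(RuleBasedStateMachine):
    """Random add/remove of locks; the invariant is checked after each step."""

    def __init__(self):
        super().__init__()
        self.locks = {}
        self.total = 0

    @rule(lock=st.integers(min_value=1), amount=st.integers(1, 100))
    def add(self, lock, amount):
        if lock not in self.locks:
            self.locks[lock] = amount
            self.total += amount

    @rule(lock=st.integers(min_value=1))
    def remove(self, lock):
        if lock in self.locks:
            self.total -= self.locks.pop(lock)

    @invariant()
    def accounting_holds(self):
        # The running total must always match the sum over all locks.
        assert self.total == sum(self.locks.values())

TestHashlockAccounting = HashlockAccounting.TestCase
```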
@pirapira Impressive finding! Testing the integrity of the implementation was something I did not have in mind when creating this issue; I was thinking about a formalization like in #188 that we can use to define distributions of valid and invalid values for fuzz testing, in order to gain confidence that the implementation works correctly. That is another step!
In `raiden`, `test_state_changes.py` does some property-based random testing. Maybe this example can be followed to some extent.
Some unused hypothesis code has been removed in https://github.com/raiden-network/raiden-contracts/pull/1360. When working on this issue, consider updating and re-adding that code.