raiden-contracts

formalized and randomized (smart-contracts) testing

Open err508 opened this issue 6 years ago • 10 comments

Problem Definition

Currently we rely on developers to identify, define and test edge cases. Whenever a developer believes something is not sufficiently tested, re-testing issues like #89 and #92 are created.

Status

From this the question arises when we consider something to be sufficiently tested. This issue is especially pressing for smart contract testing. When discussing #105 with @loredanacirstea, we agreed that the approach taken there does not solve the problem, as it mostly adds to the quantity of smart-contract testing, whereas we should address both quantity and quality.

Proposal

Formalize the protocol rules of crucial parts of /raiden and /raiden-contracts. From this formalization, derive random assignments for the data structures to be tested via a generalized testing fixture.
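To make the idea concrete, here is a minimal sketch of what "derive random assignments from formalized constraints" could look like. All names (`channel_constraint`, `sample_assignment`) and the concrete constraint are hypothetical, modeled on the deposit/withdraw invariant discussed further down in this issue:

```python
import random

# Hypothetical channel invariant, modeled on the one discussed below:
# the transferred amount may not exceed total deposits minus total withdrawals.
def channel_constraint(values):
    return values["transferred_amount"] <= (
        values["deposit_1"] + values["deposit_2"]
        - values["withdrawn_1"] - values["withdrawn_2"]
    )

def sample_assignment(rng, valid=True):
    """Draw a random assignment for the data structure under test.
    With valid=False, deliberately violate the constraint to produce
    a pathological case."""
    deposit_1 = rng.randint(0, 100)
    deposit_2 = rng.randint(0, 100)
    withdrawn_1 = rng.randint(0, deposit_1)
    withdrawn_2 = rng.randint(0, deposit_2)
    capacity = deposit_1 + deposit_2 - withdrawn_1 - withdrawn_2
    if valid:
        transferred = rng.randint(0, capacity)
    else:
        transferred = capacity + rng.randint(1, 100)
    return {
        "deposit_1": deposit_1, "deposit_2": deposit_2,
        "withdrawn_1": withdrawn_1, "withdrawn_2": withdrawn_2,
        "transferred_amount": transferred,
    }

rng = random.Random(0)
assert all(channel_constraint(sample_assignment(rng, valid=True)) for _ in range(100))
assert not any(channel_constraint(sample_assignment(rng, valid=False)) for _ in range(100))
```

A generalized fixture would feed such sampled assignments into the contract calls and assert success for valid samples and reverts for invalid ones.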

To be discussed/questions

To generate pathological cases, we only have to randomly select a subset of the constraints to be violated. This should reliably trigger errors. The major advantage of probabilistic testing is, imo, that we might also discover bugs we did not account for in the long run. The formalization of protocol rules helps with reusability and gives a definite answer as to when testing is sufficient, which could therefore speed up our development workflow in the future.

  • [x] discuss viability, feasibility and benefits
  • [x] approve or dismiss
  • [ ] if approve, decide on timeline

The test was disabled until #108 is (partially) implemented:

  • [ ] Implement correct execution paths to be executed by the state machine (i.e. open -> close: true, close -> open: false, open -> close -> open: true, etc.)
  • [ ] Implement validity/invalidity of sampled data (i.e. transferred_amount <= deposit_1 + deposit_2 - withdrawn_1 - withdrawn_2: true, [...] >= [...]: false)

In other words: we need to define correct/incorrect state transitions for the state machine test implementation, s.t. we can properly assert tests passing/failing. Additionally, we need to define validity/invalidity of the sampled data and their relation, i.e. these 4 cases and their expected results:

  • valid execution path x valid values -> test_passes
  • invalid execution path x valid values -> test_fails
  • valid execution path x invalid values -> test_fails
  • invalid execution path x invalid values -> test_fails
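The four cases above reduce to a simple conjunction; a minimal sketch (the helper name is hypothetical):

```python
def expected_to_pass(path_valid: bool, values_valid: bool) -> bool:
    """Expected outcome for one sampled scenario: the test should
    only pass when both the execution path and the values are valid."""
    return path_valid and values_valid

# The four cases from the list above:
assert expected_to_pass(True, True) is True      # valid path x valid values
assert expected_to_pass(False, True) is False    # invalid path x valid values
assert expected_to_pass(True, False) is False    # valid path x invalid values
assert expected_to_pass(False, False) is False   # invalid path x invalid values
```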

err508 avatar Jun 05 '18 15:06 err508

Take a look at the PR that removed the property and smart tests. We already discussed that you guys should make sure those are now covered by tests in the smart contracts repo.

The property tests using the hypothesis module tried to do something close to what you are describing.

LefterisJP avatar Jun 05 '18 15:06 LefterisJP

I looked into hypothesis, but I think it is beneficial to do that ourselves, as we should fully control the testing and might also reuse the formalizations for docs/tutorials. When do you expect this to be done? This is not covered by issues so far.

err508 avatar Jun 05 '18 15:06 err508

When do you expect this to be done?

I don't expect this to be done by Red Eyes. Having more formalized tests is an improvement to our testing infrastructure, which, though nice, is definitely not needed to launch Red Eyes.

I looked into hypothesis, but I think it is beneficial to do that ourselves, as we should fully control the testing and might also reuse the formalizations for docs/tutorials.

I am not convinced. I would go for the minimal-effort way of re-using an existing library rather than creating a driver for these property-testing scenarios on our own.

LefterisJP avatar Jun 05 '18 15:06 LefterisJP

I am not convinced. I would go for the minimal-effort way of re-using an existing library rather than creating a driver for these property-testing scenarios on our own.

Hypothesis could be used for /raiden, but I don't think it will help with smart contract testing because of Solidity.

err508 avatar Jun 05 '18 15:06 err508

@err508 It's probably best to use the hypothesis library. Randomized tests will by definition generate a lot of noise; the hardest part of understanding a failed random test is extracting the minimal reproducible example, and that is exactly why the library exists.
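Hypothesis does this shrinking automatically; to illustrate the idea, here is a minimal stdlib sketch of what shrinking an integer counterexample means (this is not the Hypothesis API, and the failing predicate is made up):

```python
def shrink_int(failing_input, still_fails):
    """Greedily shrink an integer counterexample toward zero while
    the property keeps failing, mimicking what Hypothesis automates."""
    current = failing_input
    while True:
        for candidate in (current // 2, current - 1):
            if 0 <= candidate < current and still_fails(candidate):
                current = candidate
                break
        else:
            # No smaller candidate still fails: this is our minimal example.
            return current

# Hypothetical buggy property: it "fails" for any value >= 10.
still_fails = lambda n: n >= 10
assert shrink_int(987_654, still_fails) == 10  # minimal reproducible example
```

A hand-rolled random test driver would have to reimplement this (and much smarter variants of it) to be debuggable.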

reuse the formalizations for docs/tutorials

Do you have any formal languages to describe the expected behavior? (At some point I considered using something like TLA+, but that in itself is a huge effort)

hackaugusto avatar Jun 05 '18 16:06 hackaugusto

@hackaugusto I still don't see how we would use Hypothesis for SC testing. I would just log timestamps + test inits before running the test. If Travis has some form of logging or notification when tests fail, this should be easy. For formal languages I'd just go for TeX; especially in the scenario I described, good old mathematics will surely be expressive enough.

err508 avatar Jun 05 '18 18:06 err508

@err508 I'd try lots of random adds/removes on the hashlock tree, and see if the accounting holds afterwards. I remember a bug in a binary tree in Haskell's Data.Map. The bug was easy to see with a property test (lots of random adds/removes on the tree, then check whether the tree is still balanced).
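A minimal sketch of that style of test, with a hypothetical lock-accounting model standing in for the hashlock tree (the `LockRegistry` class and its invariant are illustrative, not the actual contract structures):

```python
import random

# Hypothetical model of pending-lock accounting: a mapping from
# hashlock to locked amount, plus a running total that must always
# equal the sum of the individual locks.
class LockRegistry:
    def __init__(self):
        self.locks = {}
        self.total_locked = 0

    def add(self, hashlock, amount):
        assert hashlock not in self.locks
        self.locks[hashlock] = amount
        self.total_locked += amount

    def remove(self, hashlock):
        self.total_locked -= self.locks.pop(hashlock)

    def invariant_holds(self):
        return self.total_locked == sum(self.locks.values())

# Lots of random adds/removes, checking the invariant after each step.
rng = random.Random(42)
registry = LockRegistry()
for step in range(10_000):
    if registry.locks and rng.random() < 0.5:
        registry.remove(rng.choice(list(registry.locks)))
    else:
        registry.add(step, rng.randint(1, 100))  # step number as a fake hashlock
    assert registry.invariant_holds()
```

The same loop shape works against the real data structure; a seeded `Random` keeps failures reproducible.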

pirapira avatar Nov 26 '18 16:11 pirapira

@pirapira Impressive finding! Testing the integrity of the implementation was something I did not have in mind when creating this issue; I was thinking about a formalization like in #188 that we can use to define distributions of valid and invalid values for fuzz testing, in order to gain confidence that the implementation works correctly. This is another step!

err508 avatar Nov 26 '18 17:11 err508

In raiden, test_state_changes.py does some property-based random testing. Maybe this example can be followed to some extent.

pirapira avatar Feb 19 '19 10:02 pirapira

Some unused hypothesis code has been removed in https://github.com/raiden-network/raiden-contracts/pull/1360. When working on this issue, consider updating and re-adding that code.

karlb avatar Feb 14 '20 09:02 karlb