oasis-core The e2e test cases are close to unmaintainable

Open Yawning opened this issue 4 years ago • 1 comment

tldr; our e2e tests are bad and we should feel bad

Every time I need to add functionality or debug anything that involves go/oasis-test-runner or the byzantine node, things end up taking way longer than they should. As far as I can tell, there are a few reasons for this:

- oasis-test-runner and our test harness code have organically grown a mountain of overcomplicated/duplicated functionality and kludges that make maintenance a total nightmare.
- Most of our test cases are written with a lot of assumptions about how the system operates, and take shortcuts that make them exceedingly fragile to change (eg: assumptions about how timekeeping works, which I'm trying to fix).
- The byzantine node is a gigantic kludge that also makes a lot of assumptions about how the system operates, with numerous nasty hacks that should never have been merged in the first place (in particular, the old method of ensuring that the node is elected in the right spot is awful), and from a high-level abstraction/code quality standpoint it leaves much to be desired.

Admittedly, I am partly to blame for writing oasis-test-runner and some of the test cases to begin with, but from what I remember (in my biased view) my initial import was nowhere near this nightmarish.

Nov 04 '21 12:11 Yawning

Just so I hopefully remember the next time this happens, before I spend a few hours trying to figure out why an entirely unrelated change suddenly starts making e2e/runtime/txsource-multi-short fail: if the failures appear to be gRPC related, it is the test's fault, and not mine. Next time I will sit there smashing retry repeatedly.

Dec 01 '21 20:12 Yawning
