scylla-cluster-tests
fix(disrupt_terminate_and_replace_node): raise critical event on failure
If the nemesis cannot leave the cluster in the topological state it was in before it ran, it should raise a critical error so the test can be stopped.
Add a new event for topology failures, `TopologyFailureEvent`.
refs: #9918
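A minimal sketch of the behaviour described above, not the actual change in this PR. The `Severity` enum, the fields of `TopologyFailureEvent`, the `publish` callback and the list-based `cluster` are all hypothetical stand-ins, used only so the example is self-contained:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Severity(Enum):
    # Hypothetical severity levels; the real project defines its own.
    NORMAL = 1
    ERROR = 2
    CRITICAL = 3


@dataclass
class TopologyFailureEvent:
    # Stand-in for the event named in this PR; the fields are assumptions.
    message: str
    severity: Severity = Severity.CRITICAL


def terminate_and_replace(
    cluster: List[str],
    victim: str,
    add_node: Callable[[], str],
    publish: Callable[[TopologyFailureEvent], None],
) -> None:
    """Remove `victim`, then add a replacement; report a critical event on failure."""
    expected_size = len(cluster)
    cluster.remove(victim)  # terminate the target node
    try:
        cluster.append(add_node())  # bring up the replacement node
    except Exception as exc:
        # The topology was not restored: publish a critical event and re-raise
        # so the caller can stop the test.
        publish(TopologyFailureEvent(
            f"cluster has {len(cluster)} nodes, expected {expected_size}: {exc}"
        ))
        raise
```

With a failing `add_node`, the caller sees both the original exception and one critical event, while a successful run publishes nothing.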
Testing
- [ ]
PR pre-checks (self review)
- [x] I added the relevant `backport` labels
- [x] I didn't leave commented-out/debugging code
Reminders
- Add New configuration option and document them (in `sdcm/sct_config.py`)
- Add unit tests to cover my changes (under `unit-test/` folder)
- Update the Readme/doc folder relevant to this change (if needed)
Why is it needed?
If the test continues after this failure, it leads to issues with other nemeses that affect topology, like GrowShrinkCluster.
Another option is to refactor this nemesis so that it always reaches the part that adds the node, regardless of what failed earlier.
Either way, we can't accept a nemesis that removes a node and does not add a new one.
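The refactoring option mentioned above could look roughly like the following sketch, which uses `try`/`finally` so the add-node step runs even when an earlier step fails. The function name, the `intermediate_steps` parameter and the list-based `cluster` are hypothetical and do not reflect the real nemesis code:

```python
from typing import Callable, Iterable, List


def remove_then_always_add(
    cluster: List[str],
    victim: str,
    intermediate_steps: Iterable[Callable[[], None]],
    add_node: Callable[[], str],
) -> None:
    """Remove `victim`, run the intermediate steps, and always add a new node."""
    cluster.remove(victim)  # the node is gone from this point on
    try:
        for step in intermediate_steps:
            step()  # any of these may raise
    finally:
        # Runs whether or not a step failed, so the node count is restored.
        cluster.append(add_node())
```

If `add_node` itself fails inside the `finally` block, the topology is still not restored, which is the case where a critical event would remain necessary.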
@cezarmoise what is the future of this PR? Do you plan on continuing with it, or can it be closed?