scylla-cluster-tests

fix(nodebootstrapabortmanager): always clean scylla data before rebootstrap

Open aleksbykov opened this issue 1 year ago • 6 comments

With raft topology, a node that failed to bootstrap is removed from the cluster and banned, so all Scylla data must be cleaned before the rebootstrap. Depending on the log message after which the bootstrap is aborted, the bootstrap process can finish before the abort takes effect, and the node can be added to the cluster. Added additional checks for a successful bootstrap before aborting the process.
Also added more error log messages with warning severity, because they are expected when the bootstrap process is aborted.
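
The decision described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `FakeNode`, `BootstrapOutcome`, and `handle_aborted_bootstrap` are hypothetical names, and the directory list is only illustrative.

```python
from dataclasses import dataclass, field
from enum import Enum


class BootstrapOutcome(Enum):
    COMPLETED = "completed"  # bootstrap finished before the abort took effect
    ABORTED = "aborted"      # bootstrap was really interrupted


@dataclass
class FakeNode:
    """Hypothetical stand-in for a cluster node; not an SCT class."""
    name: str
    joined_cluster: bool = False
    # Illustrative directory names only (assumption).
    data_dirs: tuple = ("data", "commitlog", "hints", "view_hints")
    actions: list = field(default_factory=list)

    def clean_data(self) -> None:
        # Record the cleanup instead of touching a real filesystem.
        for directory in self.data_dirs:
            self.actions.append(f"clean:{directory}")

    def start_bootstrap(self) -> None:
        self.actions.append("bootstrap")


def handle_aborted_bootstrap(node: FakeNode) -> BootstrapOutcome:
    """Check for a successful bootstrap first; otherwise clean, then rebootstrap."""
    if node.joined_cluster:
        # The abort came too late: the node is already a cluster member,
        # so there is nothing to clean or retry.
        return BootstrapOutcome.COMPLETED
    # With raft topology the failed node was removed and banned,
    # so stale data must always be wiped before trying again.
    node.clean_data()
    node.start_bootstrap()
    return BootstrapOutcome.ABORTED
```

In this sketch the cleanup always runs before the retry, and a node that has already joined is left untouched.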

Testing

PR pre-checks (self review)

  • [ ] I added the relevant backport labels
  • [ ] I didn't leave commented-out/debugging code

Reminders

  • Add New configuration option and document them (in sdcm/sct_config.py)
  • Add unit tests to cover my changes (under unit-test/ folder)
  • Update the Readme/doc folder relevant to this change (if needed)

aleksbykov, May 26 '24 08:05

Fix for issue: https://github.com/scylladb/scylla-cluster-tests/issues/7448

aleksbykov, Jun 24 '24 07:06

@aleksbykov still draft?

soyacz, Jun 24 '24 15:06

> @aleksbykov still draft?

@soyacz it is in staging now. working with it

aleksbykov, Jun 26 '24 05:06

> > @aleksbykov still draft?
>
> @soyacz it is in staging now. working with it

@temichus it was in staging a month ago? Can someone attend to this one?

fruch, Jul 25 '24 11:07

> > > @aleksbykov still draft?
> >
> > @soyacz it is in staging now. working with it
>
> @temichus it was in staging a month ago? Can someone attend to this one?

@fruch it's on Stabilization/testing. @aleksbykov will continue working on it next week

temichus, Jul 25 '24 12:07

To avoid a lot of update messages, all commits and current activity are on the branch https://github.com/aleksbykov/scylla-cluster-tests/commits/fix-bootstrap-after-bootstrap-failed-staging/
All jobs are running under: https://argus.scylladb.com/test/d635e138-c81f-40a6-913b-db20231447d4/runs?additionalRuns[]=16ff6f94-e8cd-430f-8b23-1938e28e86c5

aleksbykov, Aug 19 '24 14:08

@soyacz @vponomaryov @fruch can you review?

aleksbykov, Sep 24 '24 11:09

Fixed according to the comments. The latest run passed.

aleksbykov, Sep 25 '24 13:09

I've rebased and updated the code for the F821 rule.

temichus, Sep 29 '24 12:09