scylla-cluster-tests
Solution for a problem: SL does not get resources on the node that was added during `disrupt_add_remove_dc`
`disrupt_sla_decrease_shares_during_load` ran in parallel with `disrupt_add_remove_dc`. The `disrupt_add_remove_dc` nemesis adds a new node in a new DC and then removes it.
`disrupt_sla_decrease_shares_during_load` failed with error:
(Node 10.4.3.76) - Service level sl:sl500_f87475aa did not get resources unexpectedly. CPU%: 0.2
The node "10.4.3.76" is the newly added node. The service level did not get resources because the node did not get any load (this is caused by the nemesis).
The same happened with the next nemesis that ran, `disrupt_sla_increase_shares_during_load`.
I can add a validation that there is no load on the node. But the question is: if no load on the node is unexpected and is itself the problem, we want to report it as a problem.
What is the problem here
The `disrupt_add_remove_dc` nemesis adds a node in a new DC.
The keyspace for the SLA test case (nemesis) is created with `replication(strategy=NetworkTopologyStrategy,replication_factor=3)`. As a result, the load does not run on the newly added node, because it is located in the new DC.
Possible solution: ignore this node during validation. A way to detect that this is the case still needs to be found.
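One possible way to detect this case, as a rough sketch only: compare each node's DC against the DCs listed in the keyspace replication map, and validate only nodes whose DC holds replicas. This assumes the new DC is absent from the keyspace's replication map (shaped like the `replication` column of `system_schema.keyspaces`). `Node`, `datacenters_with_replicas`, `nodes_to_validate`, the DC names and the `10.0.0.1` address are hypothetical and are not part of scylla-cluster-tests.

```python
from dataclasses import dataclass

# Keys of the replication map that are options, not DC names (assumption).
NON_DC_KEYS = {"class", "replication_factor"}


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a cluster node; not the SCT node class."""
    ip: str
    datacenter: str


def datacenters_with_replicas(replication):
    """Return the names of DCs that hold at least one replica of the keyspace."""
    return {
        dc for dc, rf in replication.items()
        if dc not in NON_DC_KEYS and int(rf) > 0
    }


def nodes_to_validate(nodes, replication):
    """Keep only nodes located in a DC the keyspace replicates to."""
    replicated = datacenters_with_replicas(replication)
    return [node for node in nodes if node.datacenter in replicated]


# Illustrative data: "dc1" is the original DC, "dc2" the DC added by the nemesis.
nodes = [Node("10.0.0.1", "dc1"), Node("10.4.3.76", "dc2")]
replication = {
    "class": "org.apache.cassandra.locator.NetworkTopologyStrategy",
    "dc1": "3",
}
checked = nodes_to_validate(nodes, replication)
skipped = [node.ip for node in nodes if node not in checked]
```

With this approach, a node in a replicated DC that gets no load would still be reported as a problem, while the node added in the new DC is skipped.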
Argus: https://argus.scylladb.com/test/1aebcb86-a767-4ce5-a88a-c977ab077ddc/runs?additionalRuns%5B%5D=1e57a6c9-e21e-40e4-ba8f-2aa6d677438f