CI Failure (`leader count on shard (2, 0) (3) is < 4`) in `AutomaticLeadershipBalancingTest.test_automatic_rebalance`
https://buildkite.com/redpanda/vtools/builds/12307
Module: rptest.tests.leadership_transfer_test
Class: AutomaticLeadershipBalancingTest
Method: test_automatic_rebalance
test_id: AutomaticLeadershipBalancingTest.test_automatic_rebalance
status: FAIL
run time: 91.740 seconds
AssertionError('leader count on shard (2, 0) (3) is < 4')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 269, in run_test
    return self.test_context.function(self.test)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 104, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/tests/leadership_transfer_test.py", line 324, in test_automatic_rebalance
    assert count >= expected_min, \
AssertionError: leader count on shard (2, 0) (3) is < 4
JIRA Link: CORE-1883
* https://buildkite.com/redpanda/vtools/builds/12321
* https://buildkite.com/redpanda/vtools/builds/12578
* https://buildkite.com/redpanda/redpanda/builds/47080
* https://buildkite.com/redpanda/vtools/builds/12592
* https://buildkite.com/redpanda/redpanda/builds/47264
* https://buildkite.com/redpanda/redpanda/builds/47372
* https://buildkite.com/redpanda/vtools/builds/12679
* https://buildkite.com/redpanda/redpanda/builds/47416
* https://buildkite.com/redpanda/redpanda/builds/47612
* https://buildkite.com/redpanda/redpanda/builds/47756
* https://buildkite.com/redpanda/vtools/builds/12865
* https://buildkite.com/redpanda/vtools/builds/12866
* https://buildkite.com/redpanda/redpanda/builds/47786
* https://buildkite.com/redpanda/redpanda/builds/47838
* https://buildkite.com/redpanda/redpanda/builds/47843
* https://buildkite.com/redpanda/vtools/builds/12890
* https://buildkite.com/redpanda/vtools/builds/12994
* https://buildkite.com/redpanda/redpanda/builds/48064
* https://buildkite.com/redpanda/redpanda/builds/48086
* https://buildkite.com/redpanda/redpanda/builds/48099
The test is failing because a newly restarted node sends an incomplete health report (some of its partitions haven't started yet), so the balancer acts on leader counts that are too low for that node. I guess we can give newly restarted nodes a grace period and mute them for a bit before transferring leadership to them (a good idea for other reasons as well).
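To make the grace-period idea concrete, here is a minimal sketch in Python of what "mute recently restarted nodes" could look like when picking a leadership transfer target. This is not the balancer's actual code: `NodeState`, `is_muted`, `pick_transfer_target`, and the 30-second `RESTART_GRACE_PERIOD_S` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical grace period; the real value (if any) would be a tuning decision.
RESTART_GRACE_PERIOD_S = 30.0


@dataclass
class NodeState:
    """Hypothetical summary of one node's latest health report."""
    node_id: int
    uptime_s: float    # seconds since the node (re)started
    leader_count: int  # leaders listed in the node's latest health report


def is_muted(node: NodeState,
             grace_period_s: float = RESTART_GRACE_PERIOD_S) -> bool:
    """A node still inside its grace period may have sent an incomplete
    health report, so it is skipped as a transfer target."""
    return node.uptime_s < grace_period_s


def pick_transfer_target(
        nodes: Iterable[NodeState],
        grace_period_s: float = RESTART_GRACE_PERIOD_S) -> Optional[NodeState]:
    """Return the least-loaded node that is not muted, or None if every
    node is still inside its grace period."""
    candidates = [n for n in nodes if not is_muted(n, grace_period_s)]
    if not candidates:
        return None
    # Break ties on node_id so the choice is deterministic.
    return min(candidates, key=lambda n: (n.leader_count, n.node_id))
```

With this sketch, a node that restarted 5 seconds ago and reports 0 leaders is ignored, and the least-loaded node with a trustworthy report is chosen instead. Setting the grace period to 0 gives back the unmuted behaviour.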
* https://buildkite.com/redpanda/vtools/builds/13100
* https://buildkite.com/redpanda/vtools/builds/13147
* https://buildkite.com/redpanda/vtools/builds/13189
* https://buildkite.com/redpanda/vtools/builds/13219
* https://buildkite.com/redpanda/vtools/builds/13228
* https://buildkite.com/redpanda/redpanda/builds/48366
* https://buildkite.com/redpanda/redpanda/builds/48606
* https://buildkite.com/redpanda/redpanda/builds/48666
* https://buildkite.com/redpanda/redpanda/builds/48667
* https://buildkite.com/redpanda/redpanda/builds/48717
* https://buildkite.com/redpanda/vtools/builds/13497
* https://buildkite.com/redpanda/vtools/builds/13656
* https://buildkite.com/redpanda/redpanda/builds/48992
* https://buildkite.com/redpanda/redpanda/builds/48990
* https://buildkite.com/redpanda/redpanda/builds/49002
* https://buildkite.com/redpanda/redpanda/builds/49075
* https://buildkite.com/redpanda/redpanda/builds/49143
* https://buildkite.com/redpanda/vtools/builds/13742
* https://buildkite.com/redpanda/redpanda/builds/49202
* https://buildkite.com/redpanda/vtools/builds/13769
* https://buildkite.com/redpanda/vtools/builds/13773
* https://buildkite.com/redpanda/vtools/builds/13892
* https://buildkite.com/redpanda/redpanda/builds/49468
* https://buildkite.com/redpanda/redpanda/builds/49567
* https://buildkite.com/redpanda/vtools/builds/14105