
Failed to stop node in `test_cancelling_partition_move_x_core`

Open dotnwat opened this issue 1 year ago • 0 comments

Seeing quite a few of these "failed to stop node in 30 seconds" errors when running the `test_cancelling_partition_move_x_core` test.

Here is an example run: https://buildkite.com/redpanda/redpanda/builds/14116#01829956-4db0-4e72-83c9-93b87fd21e39

The redpanda logs show that a shutdown sequence was initiated, but they cut off partway through. It looks like the node hung during shutdown and was then killed.

test_id:    rptest.tests.partition_move_interruption_test.PartitionMoveInterruption.test_cancelling_partition_move_x_core.replication_factor=3.unclean_abort=False.recovery=restart_recovery
--
  | status:     FAIL
  | run time:   7 minutes 38.944 seconds
  |  
  |  
  | TimeoutError('Redpanda node docker-rp-12 failed to stop in 30 seconds')
  | Traceback (most recent call last):
  | File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
  | data = self.run_test()
  | File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
  | return self.test_context.function(self.test)
  | File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 476, in wrapper
  | return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  | File "/root/tests/rptest/services/cluster.py", line 35, in wrapped
  | r = f(self, *args, **kwargs)
  | File "/root/tests/rptest/tests/partition_move_interruption_test.py", line 154, in test_cancelling_partition_move_x_core
  | self.redpanda.restart_nodes(
  | File "/root/tests/rptest/services/redpanda.py", line 1394, in restart_nodes
  | self.stop_node(node, timeout=stop_timeout)
  | File "/root/tests/rptest/services/redpanda.py", line 1201, in stop_node
  | wait_until(
  | File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 58, in wait_until
  | raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
  | ducktape.errors.TimeoutError: Redpanda node docker-rp-12 failed to stop in 30 seconds
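
For context on where the error comes from: the traceback shows `stop_node` calling `wait_until`, which raises once the 30-second stop timeout expires. Below is a minimal sketch of that polling pattern. It is not the actual code in `rptest/services/redpanda.py` or ducktape: `StopTimeout`, the `is_running` callback, and this simplified `wait_until` are hypothetical stand-ins written only for illustration.

```python
import time


class StopTimeout(Exception):
    """Hypothetical stand-in for ducktape.errors.TimeoutError."""


def wait_until(condition, timeout_sec, backoff_sec=0.1, err_msg=""):
    """Poll condition() until it returns True or timeout_sec elapses."""
    deadline = time.monotonic() + timeout_sec
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            # err_msg may be a string or a zero-argument callable
            raise StopTimeout(err_msg() if callable(err_msg) else err_msg)
        time.sleep(backoff_sec)


def stop_node(is_running, name, timeout_sec=30):
    """Assumed shape: a stop signal was already sent; wait for the process to exit."""
    wait_until(
        lambda: not is_running(),
        timeout_sec=timeout_sec,
        err_msg=f"Redpanda node {name} failed to stop in {timeout_sec} seconds",
    )
```

Under this sketch, a node whose shutdown sequence starts but never finishes keeps `is_running()` true until the deadline, which would match the symptom above: the test fails with the timeout message rather than with an error from redpanda itself.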

dotnwat, Aug 14 '22 00:08