CI Failure (key symptom) in `RollingRestartTest.test_rolling_restart`
https://buildkite.com/redpanda/vtools/builds/13738
Module: rptest.redpanda_cloud_tests.rolling_restart_test
Class: RollingRestartTest
Method: test_rolling_restart
test_id: RollingRestartTest.test_rolling_restart
status: FAIL
run time: 1423.022 seconds
CalledProcessError(1, ['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cp24867ek221n77ef6tg-agent', 'kubectl', 'get', 'pods', '-n', 'redpanda', '-o', 'json'], '', '\x1b[31mERROR: \x1b[0mfailed connecting to host cp24867ek221n77ef6tg-agent:0: failed to receive cluster details response\n\tfailed to dial target host\n\tTeleport proxy failed to connect to "node" agent "@local-node" over reverse tunnel:\n\n ssh: unexpected packet in response to channel open: <nil>\n\nThis usually means that the agent is offline or has disconnected. Check the\nagent logs and, if the issue persists, try restarting it or re-registering it\nwith the cluster.\n\n')
Traceback (most recent call last):
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 197, in _local_cmd
    s_out, s_err = process.communicate(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1154, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.10/subprocess.py", line 2022, in _communicate
    self._check_timeout(endtime, orig_timeout, stdout, stderr)
  File "/usr/lib/python3.10/subprocess.py", line 1198, in _check_timeout
    raise TimeoutExpired(
subprocess.TimeoutExpired: Command '['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cp24867ek221n77ef6tg-agent', 'kubectl', 'delete', 'pod', 'rp-cp24867ek221n77ef6tg-3', '-n=redpanda']' timed out after 900 seconds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 103, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/rolling_restart_test.py", line 35, in test_rolling_restart
    self.redpanda.rolling_restart_pods()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1778, in rolling_restart_pods
    self.restart_pod(pod_name, pod_timeout)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1749, in restart_pod
    self.kubectl.cmd(delete_cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 256, in cmd
    return self._ssh_cmd(cmd, capture=capture)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 232, in _ssh_cmd
    return self._local_cmd(local_cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 205, in _local_cmd
    raise subprocess.TimeoutExpired(cmd, timeout, s_out, s_err)
subprocess.TimeoutExpired: Command '['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cp24867ek221n77ef6tg-agent', 'kubectl', 'delete', 'pod', 'rp-cp24867ek221n77ef6tg-3', '-n=redpanda']' timed out after 900 seconds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 126, in wrapped
    redpanda.raise_on_crash(log_allow_list=log_allow_list)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 2084, in raise_on_crash
    active, _, _ = self.get_redpanda_pods_presorted()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1669, in get_redpanda_pods_presorted
    all_pods = self.get_redpanda_pods()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1697, in get_redpanda_pods
    pods = json.loads(self.kubectl.cmd('get pods -n redpanda -o json'))
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 256, in cmd
    return self._ssh_cmd(cmd, capture=capture)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 232, in _ssh_cmd
    return self._local_cmd(local_cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 215, in _local_cmd
    raise subprocess.CalledProcessError(process.returncode, cmd, s_out,
subprocess.CalledProcessError: Command '['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cp24867ek221n77ef6tg-agent', 'kubectl', 'get', 'pods', '-n', 'redpanda', '-o', 'json']' returned non-zero exit status 1.
JIRA Link: CORE-2975