scylla-cluster-tests
DecommissionSeedNode nemesis is failing on Docker backend
DecommissionSeedNode test case is failing on Docker backend with the error:
Last events by severity
CRITICAL - [60]
2024-04-08 12:52:28.274: (CassandraStressLogEvent Severity.CRITICAL) period_type=one-time event_id=d80834b5-1f7f-43e7-9ba7-c0371561612c during_nemesis=NodetoolSeedDecommission: type=OperationOnKey regex=Operation x10 on key\(s\) \[ line_number=559 node=Node longevity-1gb-1h-nemesis-longevit-loader-node-c3eea640-0 [172.17.0.5 | 172.17.0.5] (seed: False)
java.io.IOException: Operation x10 on key(s) [3537324c3436314d3330]: Error executing: (UnavailableException): Not enough replicas available for query at consistency QUORUM (2 required but only 1 alive)
...
ERROR - [4]
2024-04-08 12:29:16.518: (NodetoolEvent Severity.ERROR) period_type=end event_id=4d315db6-4735-4c54-8af4-6e00faf5209e during_nemesis=NodetoolSeedDecommission duration=2s: nodetool_command=cleanup node=longevity-1gb-1h-nemesis-longevit-db-node-c3eea640-0 errors=["Encountered a bad command exit code!\n\nCommand: '/usr/bin/nodetool cleanup scylla_bench'\n\nExit code: 1\n\nStdout:\n\nnodetool: Keyspace [scylla_bench] does not exist.\nSee 'nodetool help' or 'nodetool help <command>'.\n\nStderr:\n\n\n\n", 'Traceback (most recent call last):\n File "/home/ubuntu/scylla-cluster-tests/sdcm/cluster.py", line 2529, in run_nodetool\n self.remoter.run(cmd, timeout=timeout, ignore_status=ignore_status, verbose=verbose, retry=retry)\n File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/remote_base.py", line 614, in run\n result = _run()\n File "/home/ubuntu/scylla-cluster-tests/sdcm/utils/decorators.py", line 65, in inner\n return func(*args, **kwargs)\n File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/remote_base.py", line 605, in _run\n return self._run_execute(cmd, timeout, ignore_status, verbose, new_session, watchers)\n File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/remote_base.py", line 538, in _run_execute\n result = connection.run(**command_kwargs)\n File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/libssh2_client/__init__.py", line 620, in run\n return self._complete_run(channel, exception, timeout_reached, timeout, result, warn, stdout, stderr)\n File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/libssh2_client/__init__.py", line 655, in _complete_run\n raise UnexpectedExit(result)\nsdcm.remote.libssh2_client.exceptions.UnexpectedExit: Encountered a bad command exit code!\n\nCommand: \'/usr/bin/nodetool cleanup scylla_bench\'\n\nExit code: 1\n\nStdout:\n\nnodetool: Keyspace [scylla_bench] does not exist.\nSee \'nodetool help\' or \'nodetool help <command>\'.\n\nStderr:\n\n\n\n\n']
Traceback (most recent call last):
File "/home/ubuntu/scylla-cluster-tests/sdcm/cluster.py", line 2529, in run_nodetool
self.remoter.run(cmd, timeout=timeout, ignore_status=ignore_status, verbose=verbose, retry=retry)
File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/remote_base.py", line 614, in run
result = _run()
File "/home/ubuntu/scylla-cluster-tests/sdcm/utils/decorators.py", line 65, in inner
return func(*args, **kwargs)
File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/remote_base.py", line 605, in _run
return self._run_execute(cmd, timeout, ignore_status, verbose, new_session, watchers)
File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/remote_base.py", line 538, in _run_execute
result = connection.run(**command_kwargs)
File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/libssh2_client/__init__.py", line 620, in run
return self._complete_run(channel, exception, timeout_reached, timeout, result, warn, stdout, stderr)
File "/home/ubuntu/scylla-cluster-tests/sdcm/remote/libssh2_client/__init__.py", line 655, in _complete_run
raise UnexpectedExit(result)
sdcm.remote.libssh2_client.exceptions.UnexpectedExit: Encountered a bad command exit code!
Command: '/usr/bin/nodetool cleanup scylla_bench'
Exit code: 1
Stdout:
nodetool: Keyspace [scylla_bench] does not exist.
See 'nodetool help' or 'nodetool help <command>'.
Stderr:
...
Installation details
SCT Version: master
Scylla version (or git commit hash): 2024.1.2
Logs
- job log: https://jenkins.scylladb.com/view/staging/job/scylla-staging/job/dimakr/job/longevity-5gb-1h-docker-test/14/
Looks like 2 nodes were down (or the cluster was overloaded); the decommission itself should work. Can you share the Argus link?
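For reference, a minimal sketch of the quorum arithmetic behind the `UnavailableException` message ("2 required but only 1 alive"). The replication factor is not stated in this report, so the snippet only lists the RF values that would give a quorum of 2; the helper names are illustrative and not part of SCT.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def replicas_down(replication_factor: int, alive: int) -> int:
    """Replicas of the key that are not reported alive."""
    return replication_factor - alive


# Replication factors consistent with "2 required" in the error message.
candidates = [rf for rf in range(1, 6) if quorum(rf) == 2]

for rf in candidates:
    # The error reports exactly 1 alive replica for the failing key.
    print(f"RF={rf}: quorum={quorum(rf)}, replicas down={replicas_down(rf, 1)}")
```

If the keyspace uses RF=3 (an assumption), one alive replica means two replicas of that key were unavailable, which would match the "2 nodes were down" reading; with RF=2 a single unavailable replica would be enough to produce the same error.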
@soyacz The build was no longer in Jenkins because it had been rotated, which is probably why there was no build in Argus either. I re-executed the nemesis (Jenkins build), but I still don't see the build in Argus.
That's because configurations/nemesis/additional_configs/docker_backend_local.yaml was used, which contains:
# TODO: remove this when we'll run this in jenkins
enable_argus: false
Is it fixed for jobs in Jenkins?
@soyacz Right, this is disabled in master for now, until the change that enables running the Docker backend in Jenkins is merged (I was using my own branch to run the tests in Jenkins).