scylla-cluster-tests

hydra clean-resources gives an error when (only) SCT_SCYLLA_VERSION is set

Open cezarmoise opened this issue 1 year ago • 5 comments

Running

export SCT_SCYLLA_VERSION=5.2.1
hydra run-test ...
# then later
hydra clean-resources --user `whoami` -b aws

Gives this error

There is scylladb/hydra:v1.74-tcconfig-0.29.1 in local cache, using it.
Going to run './sct.py  clean-resources --user cezar.moise -b aws'...
/usr/local/lib/python3.10/site-packages/paramiko/pkey.py:100: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
  "cipher": algorithms.TripleDES,
/usr/local/lib/python3.10/site-packages/paramiko/transport.py:259: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
  "class": algorithms.TripleDES,
logged in as arn:aws:sts::797456418907:assumed-role/DeveloperAccessRole/[email protected]
New directory created: /home/cezar/sct-results/20240813-102128-166284-clean-resources
Clean all resources belong to user `cezar.moise'
Failed to load configuration files: []
The run can be interrupted by following critical events:
  * ClusterHealthValidatorEvent.NodeStatus
  * ClusterHealthValidatorEvent.ScyllaCloudClusterServerDiagnostic
  * DataValidatorEvent.UpdatedRowsValidator
  * DatabaseLogEvent.CORRUPTED_SSTABLE
  * CassandraHarryEvent.failure
  * YcsbStressEvent.failure
  * NdBenchStressEvent.failure
  * NdBenchErrorEvent.BuildFailed
  * NdBenchErrorEvent.Failure
  * CDCReaderStressEvent.failure
  * NoSQLBenchStressEvent
  * NoSQLBenchStressLogEvents.ProgressIndicatorStoppedEvent
  * CassandraStressLogEvent.OperationOnKey
  * CqlStressCassandraStressLogEvent.ReadValidationError
  * ScyllaBenchLogEvent.DataValidationError
  * ScyllaBenchLogEvent.ParseDistributionError
  * GeminiStressLogEvent.GeminiEvent
  * PrometheusAlertManagerEvent
  * TestTimeoutEvent
  * TestFrameworkEvent
  * SpotTerminationEvent
  * ScyllaBenchEvent
  * CassandraStressEvent
  * CqlStressCassandraStressEvent
  * LatteStressEvent
  * GeminiStressEvent
  * HWPerforanceEvent
  * PartitionRowsValidationEvent


Traceback (most recent call last):
  File "/home/cezar/Documents/github/cezarmoise/scylla-cluster-tests/./sct.py", line 1863, in <module>
    cli.main(prog_name="hydra")
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/cezar/Documents/github/cezarmoise/scylla-cluster-tests/./sct.py", line 324, in clean_resources
    config = SCTConfiguration()
  File "/home/cezar/Documents/github/cezarmoise/scylla-cluster-tests/sdcm/sct_config.py", line 1809, in __init__
    aws_arch = get_arch_from_instance_type(self.get('instance_type_db'), region_name=region)
  File "/home/cezar/Documents/github/cezarmoise/scylla-cluster-tests/sdcm/utils/aws_utils.py", line 449, in get_arch_from_instance_type
    instance_type_info = client.describe_instance_types(InstanceTypes=[instance_type])
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 534, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 935, in _make_api_call
    request_dict = self._convert_to_request_dict(
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 1003, in _convert_to_request_dict
    request_dict = self._serializer.serialize_to_request(
  File "/usr/local/lib/python3.10/site-packages/botocore/validate.py", line 381, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter InstanceTypes[0], value: None, type: <class 'NoneType'>, valid types: <class 'str'>

Other backends, such as GCE, don't have this issue:

Going to get all instances from GCE
Done. Found total of 0 instances.
There are no GCE instances to remove in the gcp-sct-project-1 project
Cleanup for the {'RunByUser': 'cezar.moise', 'CreatedBy': 'SCT'} resources has been finished

cezarmoise, Aug 13 '24 13:08

It's something with your env. I don't know yet what. Do you have any changes in /home/cezar/Documents/github/cezarmoise/scylla-cluster-tests ?

soyacz, Aug 13 '24 13:08

No changes. It also fails when there are resources to be stopped.

I managed to track down the issue.

Apparently, having SCT_SCYLLA_VERSION=5.2.1 still set in my env from an earlier run causes that error. Unsetting the variable fixes it.

cezarmoise, Aug 13 '24 15:08

OK, please adjust the title and description to reflect your findings.

soyacz, Aug 14 '24 06:08

from the QA chat:

Yulia Yakovlev, Tue 6:15 PM
also received the error. As I see from the code, if scylla_version is set, it starts to get the arch from the instance type. But the instance type is not defined

Yulia Yakovlev, Tue 6:15 PM, Edited
File "/home/juliayakovlev/scylla_repo/scylla-cluster-tests/sdcm/sct_config.py", line 1820, in __init__
    aws_arch = get_arch_from_instance_type(self.get('instance_type_db'), region_name=region)
The error is Invalid type for parameter InstanceTypes[0], value: None, type: <class 'NoneType'>, valid types: <class 'str'>
I suppose that instance_type_db is None

You, Tue 6:25 PM
hydra clean-resources was changed during the last year to work based on the test configuration, hence any SCT environment variable can affect it
in your case you passed one specific parameter, and not a full test case
hence the missing instance_type_db
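
The chain described above can be illustrated with a minimal sketch. It is not the SCT code: `load_config_from_env` and `resolve_arch` are hypothetical stand-ins, and the mapping from `SCT_`-prefixed variables to lowercase config keys is only inferred from the names seen in this thread (`SCT_SCYLLA_VERSION`, `scylla_version`).

```python
def load_config_from_env(environ):
    """Hypothetical stand-in: map SCT_* variables to lowercase config keys."""
    prefix = "SCT_"
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


def resolve_arch(config):
    """Hypothetical stand-in for the version-triggered step in SCTConfiguration."""
    if config.get("scylla_version"):
        # Without a full test config nothing ever sets instance_type_db.
        instance_type = config.get("instance_type_db")
        # The real code passes this value to describe_instance_types(), which
        # rejects None with ParamValidationError; a TypeError mimics that here.
        if not isinstance(instance_type, str):
            raise TypeError(f"invalid instance type: {instance_type!r}")
        return f"arch lookup for {instance_type}"
    return None  # no version requested, so nothing to resolve


only_version = load_config_from_env({"SCT_SCYLLA_VERSION": "5.2.1", "HOME": "/home/user"})
print(only_version)  # {'scylla_version': '5.2.1'}
```

Calling `resolve_arch(only_version)` raises, while `resolve_arch({})` returns `None`, which matches the observation that unsetting the variable makes the error go away.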

You, Tue 6:28 PM
sct_config.py can be refactored so that those steps check more of the parameters they use before using them
also, maybe all of the resolving logic can be moved into its own functions, and not always run.
i.e. maybe cleanup (or other places) doesn't need a completely resolved configuration

fruch, Aug 15 '24 05:08
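
The refactoring idea from the chat (check parameters before using them, and keep the resolving logic in functions that only some commands call) could look roughly like the sketch below. All names here (`LazyConfig`, `require`, `resolve_images`, `cleanup_filter`, the `test_id` key) are hypothetical and chosen for illustration; the tag names are taken from the GCE log line earlier in this issue.

```python
class LazyConfig:
    """Hypothetical sketch: raw values are always available, resolution is opt-in."""

    def __init__(self, raw):
        self.raw = dict(raw)
        self.resolved = {}

    def require(self, *keys):
        """Fail early, with a readable message, if a needed parameter is missing."""
        missing = [key for key in keys if self.raw.get(key) is None]
        if missing:
            raise ValueError(f"missing configuration parameters: {', '.join(missing)}")

    def resolve_images(self, lookup_arch):
        """Resolution step that only node-provisioning commands would call."""
        self.require("scylla_version", "instance_type_db")
        self.resolved["arch"] = lookup_arch(self.raw["instance_type_db"])
        return self.resolved


def cleanup_filter(config, user):
    """Cleanup only reads raw values, so it never calls resolve_images()."""
    tags = {"RunByUser": user, "CreatedBy": "SCT"}
    test_id = config.raw.get("test_id")  # hypothetical optional key
    if test_id:
        tags["TestId"] = test_id
    return tags
```

With this split, a stray `scylla_version` in the environment does not affect cleanup, and a command that does need the architecture gets a `ValueError` naming `instance_type_db` instead of a botocore validation error.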

to sum it up, both sct_config.py and get_arch_from_instance_type can be changed to be safer and have sane defaults

fruch, Aug 15 '24 05:08
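
One possible shape for the "safer with sane defaults" suggestion is a guard around the lookup, sketched below. `get_arch_or_default` is a hypothetical wrapper, not an existing SCT function, and the `x86_64` default and the string return type are assumptions, since the thread names neither.

```python
from typing import Callable, Optional

DEFAULT_ARCH = "x86_64"  # assumed default; the thread does not name one


def get_arch_or_default(
    instance_type: Optional[str],
    lookup: Callable[[str], str],
    default: str = DEFAULT_ARCH,
) -> str:
    """Return the architecture for instance_type, or default when it is unset."""
    if not instance_type:
        # Skip the remote lookup entirely instead of sending None to the API.
        return default
    return lookup(instance_type)
```

In SCT the `lookup` argument would presumably wrap the existing call, for example `lambda t: get_arch_from_instance_type(t, region_name=region)`, so the AWS request is only made when an instance type is actually configured.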