scylla-operator
Errors like `alternator: get node info: no host config available` and `CQL: no host config available` when running `sctools status` after an update
What happened?
After updating Scylla from 5.2.9 to 5.4.7, Scylla Operator from 1.9.x to 1.12.2 (the latest that supports Scylla 5.2.x and 5.4.x), and Scylla Manager from 3.1.x to 3.2.8, we started to observe that sctool status no longer provides all the node info and returns errors:
$ kubectl exec -it deployments/scylla-manager -n scylla-manager -- sctool status --cluster scylla/scylla
Datacenter: XXX
+----+-------------+-------------+----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
| | Alternator | CQL | REST | Address | Uptime | CPUs | Memory | Scylla | Agent | Host ID |
+----+-------------+-------------+----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
| UN | ERROR (0ms) | ERROR (0ms) | UP (0ms) | 10.7.241.130 | - | - | - | - | - | 8a24c600-5525-490e-a3cd-314f6062d6a1 |
| UN | ERROR (0ms) | ERROR (0ms) | UP (6ms) | 10.7.241.174 | - | - | - | - | - | f14fcd59-8d90-4d8e-af22-ace87ceced22 |
| UN | ERROR (0ms) | ERROR (0ms) | UP (1ms) | 10.7.241.175 | - | - | - | - | - | 050dcc67-7bb8-4d5d-89b1-5dbe0bcbb8b2 |
| UN | ERROR (0ms) | ERROR (0ms) | UP (5ms) | 10.7.243.109 | - | - | - | - | - | 4a3ff045-bba2-4537-a4d7-a213d25ae713 |
| UN | ERROR (0ms) | ERROR (0ms) | UP (1ms) | 10.7.248.124 | - | - | - | - | - | 028023f5-9d4e-404c-8537-467ac3d4538c |
| UN | ERROR (0ms) | ERROR (0ms) | UP (1ms) | 10.7.249.238 | - | - | - | - | - | b8f68c62-c462-4a30-a505-5ece9ae1ab0b |
| UN | ERROR (0ms) | ERROR (0ms) | UP (0ms) | 10.7.252.229 | - | - | - | - | - | 1ff1b8df-7a90-4321-a309-7cd69e20bd70 |
+----+-------------+-------------+----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
Errors:
- 10.7.241.130 alternator: get node info: no host config available
- 10.7.241.130 CQL: no host config available
- 10.7.241.174 alternator: get node info: no host config available
- 10.7.241.174 CQL: no host config available
- 10.7.241.175 alternator: get node info: no host config available
- 10.7.241.175 CQL: no host config available
- 10.7.243.109 alternator: get node info: no host config available
- 10.7.243.109 CQL: no host config available
- 10.7.248.124 alternator: get node info: no host config available
- 10.7.248.124 CQL: no host config available
- 10.7.249.238 alternator: get node info: no host config available
- 10.7.249.238 CQL: no host config available
- 10.7.252.229 alternator: get node info: no host config available
- 10.7.252.229 CQL: no host config available
Note that our scylla.yaml didn't have any config for TLS up to that point.
We worked around this problem by setting:
client_encryption_options:
  optional: true
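For context, here is a minimal sketch of how such a scylla.yaml override is typically carried in the operator's model: a ConfigMap with a scylla.yaml key that the cluster spec references. The names and namespace are illustrative, and the rack-level scyllaConfig reference is an assumption based on the operator docs, not our exact manifests:

```yaml
# Illustrative sketch only: a ConfigMap holding the scylla.yaml override
# with the TLS workaround. Referenced from the ScyllaCluster rack spec
# via the scyllaConfig field (per the operator docs).
apiVersion: v1
kind: ConfigMap
metadata:
  name: scylla-config   # hypothetical name
  namespace: scylla
data:
  scylla.yaml: |
    client_encryption_options:
      optional: true
```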
However, we still have a problem with the Scylla Manager's cluster:
$ kubectl exec -it deployments/scylla-manager -n scylla-manager -- sctool status --cluster scylla-manager/scylla-manager
Datacenter: manager-dc
+----+-------------+-------------+-----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
| | Alternator | CQL | REST | Address | Uptime | CPUs | Memory | Scylla | Agent | Host ID |
+----+-------------+-------------+-----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
| UN | ERROR (0ms) | ERROR (0ms) | UP (92ms) | 10.7.255.190 | - | - | - | - | - | 8ec8a729-8225-4278-a9da-ad0f23f47e01 |
+----+-------------+-------------+-----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
Errors:
- 10.7.255.190 alternator: get node info: no host config available
- 10.7.255.190 CQL: no host config available
...and it seems to only have a generated ConfigMap named scylladb-managed-config:
apiVersion: v1
data:
  scylladb-managed-config.yaml: |
    cluster_name: "scylla"
    rpc_address: "0.0.0.0"
    endpoint_snitch: "GossipingPropertyFileSnitch"
    internode_compression: "all"
    native_transport_port_ssl: 9142
    native_shard_aware_transport_port_ssl: 19142
    client_encryption_options:
      enabled: true
      optional: false
      certificate: "/var/run/secrets/scylla-operator.scylladb.com/scylladb/serving-certs/tls.crt"
      keyfile: "/var/run/secrets/scylla-operator.scylladb.com/scylladb/serving-certs/tls.key"
      require_client_auth: true
      truststore: "/var/run/secrets/scylla-operator.scylladb.com/scylladb/client-ca/tls.crt"
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: scylla
    meta.helm.sh/release-namespace: scylla
    scylla-operator.scylladb.com/managed-hash: <redacted>
  creationTimestamp: "<redacted>"
  labels:
    app.kubernetes.io/managed-by: Helm
    scylla/cluster: scylla
  name: scylla-managed-config
  namespace: scylla
  ownerReferences:
  - apiVersion: scylla.scylladb.com/v1
    blockOwnerDeletion: true
    controller: true
    kind: ScyllaCluster
    name: scylla
    uid: <redacted>
  resourceVersion: "<redacted>"
  uid: <redacted>
...and I can't find anything about modifying it in https://operator.docs.scylladb.com/stable/helm.html...
Since then we have updated Scylla to 5.4.9, Operator to 1.13.0, and Manager to 3.3.0 but it did not help.
What did you expect to happen?
sctool status should work without errors for both the main cluster and the Scylla Manager's cluster after an update.
I shouldn't have to reconfigure TLS, as the defaults shown in https://github.com/scylladb/scylladb/blob/scylla-5.4.7/conf/scylla.yaml#L472-L474 say that it should be disabled.
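For reference, the upstream defaults at that link amount to client encryption being off:

```yaml
# Upstream scylla.yaml defaults (abridged from the linked file):
client_encryption_options:
  enabled: false
```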
How can we reproduce it (as minimally and precisely as possible)?
- Set up the versions mentioned above
- Use this scylla.yaml, as we had before:
read_request_timeout_in_ms: 5000
write_request_timeout_in_ms: 2000
cas_contention_timeout_in_ms: 1000
consistent_cluster_management: true
- Update to the versions mentioned above
- Check sctool status
Scylla Operator version
1.13.0
Kubernetes platform name and version
$ kubectl version
Client Version: v1.29.6
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.5-gke.1192000
Please attach the must-gather archive.
scylla-operator-must-gather-77t6kvnghzss.zip
Anything else we need to know?
The must-gather archive has been anonymized additionally by me manually, see https://github.com/scylladb/scylla-operator/issues/2015.
This problem has originally been reported here https://github.com/scylladb/scylla-manager/issues/3889, but that issue was originally about a (probably?) different problem, so I was suggested to create a new one.
I am also seeing this in the scylladb-api-status-probe container logs of the Scylla pod:
I0712 14:17:47.251251 1 operator/cmd.go:21] maxprocs: Leaving GOMAXPROCS=[1]: CPU quota undefined
I0712 14:17:47.251718 1 probeserver/scylladbapistatus.go:133] scylladb-api-status version "v1.13.0-rc.0-2-g7f37771"
I0712 14:17:47.251740 1 flag/flags.go:64] FLAG: --address=""
I0712 14:17:47.251749 1 flag/flags.go:64] FLAG: --burst="75"
I0712 14:17:47.251754 1 flag/flags.go:64] FLAG: --feature-gates=""
I0712 14:17:47.251758 1 flag/flags.go:64] FLAG: --help="false"
I0712 14:17:47.251762 1 flag/flags.go:64] FLAG: --kubeconfig=""
I0712 14:17:47.251764 1 flag/flags.go:64] FLAG: --loglevel="2"
I0712 14:17:47.251767 1 flag/flags.go:64] FLAG: --namespace="scylla"
I0712 14:17:47.251770 1 flag/flags.go:64] FLAG: --port="8080"
I0712 14:17:47.251773 1 flag/flags.go:64] FLAG: --qps="50"
I0712 14:17:47.251777 1 flag/flags.go:64] FLAG: --service-name="scylla-us-west1-us-west1-b-0"
I0712 14:17:47.251780 1 flag/flags.go:64] FLAG: --v="2"
I0712 14:17:47.252016 1 cache/shared_informer.go:311] Waiting for caches to sync for Prober
I0712 14:17:47.258338 1 cache/reflector.go:351] Caches populated for *v1.Service from k8s.io/[email protected]/tools/cache/reflector.go:229
I0712 14:17:47.353007 1 cache/shared_informer.go:318] Caches are synced for Prober
I0712 14:17:47.353249 1 probeserver/serveprobes.go:78] "Starting probe server" Address=":8080"
E0712 14:17:55.645952 1 scylladbapistatus/prober.go:82] "readyz probe: can't get scylla node status" err="agent [HTTP 404] Not found" Service="scylla/scylla-us-west1-us-west1-b-0"
E0712 14:18:05.073105 1 scylladbapistatus/prober.go:101] "readyz probe: can't get scylla native transport" err="agent [HTTP 404] Not found" Service="scylla/scylla-us-west1-us-west1-b-0" Node="10.7.252.229"
E0712 14:18:14.999478 1 scylladbapistatus/prober.go:101] "readyz probe: can't get scylla native transport" err="agent [HTTP 404] Not found" Service="scylla/scylla-us-west1-us-west1-b-0" Node="10.7.252.229"
...and in the scylla-manager-agent container logs occasionally this:
{"L":"INFO","T":"2024-07-12T16:01:53.797Z","M":"http: TLS handshake error from 10.6.241.80:48086: EOF"}
{"L":"INFO","T":"2024-07-12T16:02:05.297Z","M":"http: TLS handshake error from 10.6.241.80:49746: read tcp 10.138.0.93:10001->10.6.241.80:49746: read: connection reset by peer"}
This is for both the main cluster and the Scylla Manager's cluster, although the former has the workaround applied.
Also in the scylla-operator Deployment I am seeing this in the logs:
I0712 14:14:38.692544 1 scyllacluster/status.go:36] "Status updated" ScyllaCluster="scylla/scylla"
I0712 14:14:48.696285 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:14:48.709939 1 scyllacluster/status.go:36] "Status updated" ScyllaCluster="scylla/scylla"
E0712 14:14:52.702896 1 controllerhelpers/handlers.go:117] pod "scylla-us-west1-us-west1-b-2" not found
E0712 14:14:52.741036 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-2' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-2": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-2"
E0712 14:14:52.746390 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-2' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-2": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-2"
E0712 14:14:52.756784 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-2' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-2": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-2"
I0712 14:14:52.777037 1 record/event.go:376] "Event occurred" object="scylla/nodeconfig-podinfo-fd90882c-f1e1-4050-ae6b-ef294b5d4cb5" fieldPath="" kind="ConfigMap" apiVersion="v1" type="Normal" reason="ConfigMapCreated" message="ConfigMap scylla/nodeconfig-podinfo-fd90882c-f1e1-4050-ae6b-ef294b5d4cb5 created"
I0712 14:15:03.717394 1 record/event.go:376] "Event occurred" object="scylla/nodeconfig-podinfo-fd90882c-f1e1-4050-ae6b-ef294b5d4cb5" fieldPath="" kind="ConfigMap" apiVersion="v1" type="Normal" reason="ConfigMapUpdated" message="ConfigMap scylla/nodeconfig-podinfo-fd90882c-f1e1-4050-ae6b-ef294b5d4cb5 updated"
I0712 14:15:08.716438 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:15:08.725014 1 scyllacluster/controller.go:257] "Hit conflict, will retry in a bit" Key="scylla/scylla" Error="Operation cannot be fulfilled on scyllaclusters.scylla.scylladb.com \"scylla\": the object has been modified; please apply your changes to the latest version and try again"
I0712 14:15:18.729079 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:15:18.743043 1 scyllacluster/status.go:36] "Status updated" ScyllaCluster="scylla/scylla"
I0712 14:15:28.746705 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:15:28.755023 1 scyllacluster/controller.go:257] "Hit conflict, will retry in a bit" Key="scylla/scylla" Error="Operation cannot be fulfilled on scyllaclusters.scylla.scylladb.com \"scylla\": the object has been modified; please apply your changes to the latest version and try again"
I0712 14:15:58.765715 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:15:58.773584 1 scyllacluster/controller.go:257] "Hit conflict, will retry in a bit" Key="scylla/scylla" Error="Operation cannot be fulfilled on scyllaclusters.scylla.scylladb.com \"scylla\": the object has been modified; please apply your changes to the latest version and try again"
E0712 14:16:16.164708 1 controllerhelpers/handlers.go:117] pod "scylla-us-west1-us-west1-b-1" not found
E0712 14:16:16.205687 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-1' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-1": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-1"
E0712 14:16:16.210974 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-1' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-1": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-1"
E0712 14:16:16.221368 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-1' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-1": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-1"
I0712 14:16:16.241720 1 record/event.go:376] "Event occurred" object="scylla/nodeconfig-podinfo-7c4ac91a-f439-4869-8cc0-ad4f1fdfea81" fieldPath="" kind="ConfigMap" apiVersion="v1" type="Normal" reason="ConfigMapCreated" message="ConfigMap scylla/nodeconfig-podinfo-7c4ac91a-f439-4869-8cc0-ad4f1fdfea81 created"
I0712 14:16:28.783327 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:16:28.797779 1 scyllacluster/status.go:36] "Status updated" ScyllaCluster="scylla/scylla"
I0712 14:16:29.192472 1 record/event.go:376] "Event occurred" object="scylla/nodeconfig-podinfo-7c4ac91a-f439-4869-8cc0-ad4f1fdfea81" fieldPath="" kind="ConfigMap" apiVersion="v1" type="Normal" reason="ConfigMapUpdated" message="ConfigMap scylla/nodeconfig-podinfo-7c4ac91a-f439-4869-8cc0-ad4f1fdfea81 updated"
I0712 14:17:18.817963 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:17:18.826596 1 scyllacluster/controller.go:257] "Hit conflict, will retry in a bit" Key="scylla/scylla" Error="Operation cannot be fulfilled on scyllaclusters.scylla.scylladb.com \"scylla\": the object has been modified; please apply your changes to the latest version and try again"
E0712 14:17:34.627808 1 controllerhelpers/handlers.go:117] pod "scylla-us-west1-us-west1-b-0" not found
E0712 14:17:34.675797 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-0' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-0": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-0"
E0712 14:17:34.681062 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-0' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-0": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-0"
E0712 14:17:34.691344 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-0' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-0": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-0"
E0712 14:17:34.711688 1 nodeconfigpod/controller.go:291] syncing key 'scylla/scylla-us-west1-us-west1-b-0' failed: can't make configmap for pod "scylla/scylla-us-west1-us-west1-b-0": can't get container id: no scylla container found in pod "scylla/scylla-us-west1-us-west1-b-0"
I0712 14:17:34.755596 1 record/event.go:376] "Event occurred" object="scylla/nodeconfig-podinfo-784f0acf-f384-4efb-b2af-4dfbeecaf684" fieldPath="" kind="ConfigMap" apiVersion="v1" type="Normal" reason="ConfigMapCreated" message="ConfigMap scylla/nodeconfig-podinfo-784f0acf-f384-4efb-b2af-4dfbeecaf684 created"
I0712 14:17:48.651971 1 record/event.go:376] "Event occurred" object="scylla/nodeconfig-podinfo-784f0acf-f384-4efb-b2af-4dfbeecaf684" fieldPath="" kind="ConfigMap" apiVersion="v1" type="Normal" reason="ConfigMapUpdated" message="ConfigMap scylla/nodeconfig-podinfo-784f0acf-f384-4efb-b2af-4dfbeecaf684 updated"
I0712 14:17:48.834780 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
I0712 14:17:48.849950 1 scyllacluster/status.go:36] "Status updated" ScyllaCluster="scylla/scylla"
I0712 14:18:38.865922 1 scyllacluster/status.go:29] "Updating status" ScyllaCluster="scylla/scylla"
Maybe it's related?
We are also seeing disk usage growing constantly on all the nodes since the update, even though our cluster usage has not changed; apart from that, the cluster itself seems to be working normally.
(I reported this issue separately here https://github.com/scylladb/scylladb/issues/19793 as I don't think it's related to this one.)
Scylla Operator from 1.9.x to 1.12.2
scylla operator only supports n+1 upgrades, otherwise you may miss a migration step
Alternator should be configured through the API, see:
- https://operator.docs.scylladb.com/stable/clients/alternator.html
- https://operator.docs.scylladb.com/stable/api-reference/groups/scylla.scylladb.com/scyllaclusters.html#api-scylla-scylladb-com-scyllaclusters-v1-spec-alternator
Is the Alternator API working on its own? I'd expect you need to take some extra steps to configure the certificates with it. For the Manager integration, CQL and Alternator certs are not supported yet.
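For reference, enabling Alternator through the API the links above describe looks roughly like this; the field values here are illustrative, and the linked API reference is authoritative:

```yaml
# Rough sketch of the ScyllaCluster API surface for Alternator.
# Consult the linked API reference for the exact fields and values.
apiVersion: scylla.scylladb.com/v1
kind: ScyllaCluster
metadata:
  name: scylla
  namespace: scylla
spec:
  alternator:
    port: 8000              # illustrative
    writeIsolation: always  # illustrative
```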
Scylla Operator from 1.9.x to 1.12.2
scylla operator only supports n+1 upgrades, otherwise you may miss a migration step
Oh, got it now but I didn't do it this way as it was not documented at https://operator.docs.scylladb.com/stable/upgrade.html...
But it's a fact that I forgot about CRD updates completely. 😞
Alternator should be configured through the API, see:
- https://operator.docs.scylladb.com/stable/clients/alternator.html
- https://operator.docs.scylladb.com/stable/api-reference/groups/scylla.scylladb.com/scyllaclusters.html#api-scylla-scylladb-com-scyllaclusters-v1-spec-alternator

Is the Alternator API working on its own? I'd expect you need to take some extra steps to configure the certificates with it. For the Manager integration, CQL and Alternator certs are not supported yet.
We are not using Alternator.
How to fix this now, @tnozicka? Should I apply CRDs from each version 1.10.<latest_patch>, 1.11.<latest_patch>, ..., 1.13.<latest_patch>, as documented in the 2nd step of https://operator.docs.scylladb.com/stable/upgrade.html#upgrade-via-helm?
Oh, got it now but I didn't do it this way as it was not documented at https://operator.docs.scylladb.com/stable/upgrade.html
It only shows the X.Y.Z to X.Y+1.Z upgrades (https://operator.docs.scylladb.com/stable/upgrade.html#v1-2-0-v1-3-0), but I thought we had it in some place generically too.
How to fix this now
Roll back the operator deployment manifest and image to where it started, then follow the upgrade guide for each Y+1 from there (operator + CRD + wait for rollouts for each bump).
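A rough sketch of that sequential path, printing the planned commands rather than running anything (the version numbers, namespace, and manifest URLs are assumptions; the official upgrade guide for each hop is authoritative):

```shell
# Sketch only: build the list of per-minor-version upgrade steps so the plan
# can be reviewed before anything is applied. All specifics are illustrative.
steps=""
for v in v1.10.5 v1.11.3 v1.12.2 v1.13.0; do
  steps="${steps}kubectl apply --server-side -f https://raw.githubusercontent.com/scylladb/scylla-operator/${v}/deploy/operator.yaml
kubectl -n scylla-operator rollout status deployment/scylla-operator
"
done
printf '%s' "$steps"
```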
Oh, got it now but I didn't do it this way as it was not documented at https://operator.docs.scylladb.com/stable/upgrade.html
It only shows the X.Y.Z to X.Y+1.Z upgrades (https://operator.docs.scylladb.com/stable/upgrade.html#v1-2-0-v1-3-0), but I thought we had it in some place generically too.
Oh, you were right, in https://operator.docs.scylladb.com/stable/upgrade.html#upgrade-via-helm there is a step with the CRD updates. 🤦‍♂️ Sorry!
How to fix this now
Roll back the operator deployment manifest and image to where it started, then follow the upgrade guide for each Y+1 from there (operator + CRD + wait for rollouts for each bump).
We did this but I am still seeing:
$ kubectl exec -it deployments/scylla-manager -n scylla-manager -- sctool status --cluster scylla-manager/scylla-manager
Datacenter: manager-dc
+----+-------------+-------------+----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
| | Alternator | CQL | REST | Address | Uptime | CPUs | Memory | Scylla | Agent | Host ID |
+----+-------------+-------------+----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
| UN | ERROR (0ms) | ERROR (0ms) | UP (0ms) | 10.7.255.190 | - | - | - | - | - | 8ec8a729-8225-4278-a9da-ad0f23f47e01 |
+----+-------------+-------------+----------+--------------+--------+------+--------+--------+-------+--------------------------------------+
Errors:
- 10.7.255.190 alternator: get node info: no host config available
- 10.7.255.190 CQL: no host config available
What's next?
We have updated the Scylla Manager to 3.3.1 and we are still having this problem.
I don't really care that much about the ugly output of sctool status for the Scylla Manager's cluster, but we also see that updating the Scylla Manager cluster's task configurations is not working, perhaps because of this.
The relevant logs from the scylla-manager-controller Deployment:
E0827 09:34:57.898551 1 manager/controller.go:154] syncing key 'scylla-manager/scylla-manager' failed: can't execute action: can't update task "manager-daily-backup": [PUT /cluster/{cluster_id}/task/{task_type}/{task_id}][404] PutClusterClusterIDTaskTaskTypeTaskID default &{Details: Message:get resource: create backup target: create cluster session: TLS/SSL key/cert is not registered: not found TraceID:s-j603PPTLC2kyO2xXY6hA}
E0827 09:34:58.328620 1 manager/sync.go:136] "Failed to execute action" err="can't update task \"manager-daily-backup\": [PUT /cluster/{cluster_id}/task/{task_type}/{task_id}][404] PutClusterClusterIDTaskTaskTypeTaskID default &{Details: Message:get resource: create backup target: create cluster session: TLS/SSL key/cert is not registered: not found TraceID:Ii-0uK49T3-K70PTOVmD5Q}" action="update task &{ClusterID: Enabled:true ID:0db86eed-6ec-4aa2-879d-05e1b84fb428 Name:manager-daily-backup Properties:map[dc:[manager-dc] location:[gcs:fetlife-scylla-manager-backups] retention:7] Schedule:0xc000213dc0 Tags:[] Type:backup}"
The backups themselves are not working either:
$ kubectl exec -it deployments/scylla-manager -n scylla-manager -- sctool tasks --cluster scylla-manager/scylla-manager
+------------------------------+--------+----------+--------+----------+---------+-------+------------------------+------------------------+--------+------------------------+
| Task | Labels | Schedule | Window | Timezone | Success | Error | Last Success | Last Error | Status | Next |
+------------------------------+--------+----------+--------+----------+---------+-------+------------------------+------------------------+--------+------------------------+
| backup/manager-daily-backup | | 1d | | | 658 | 60 | 12 Jul 24 11:00:23 UTC | 01 Sep 24 11:00:00 UTC | ERROR | 02 Sep 24 11:00:00 UTC |
| healthcheck/rest | | 1m | | | 1093493 | 0 | 02 Sep 24 08:28:56 UTC | | DONE | 02 Sep 24 08:29:56 UTC |
| healthcheck/alternator | | 15s | | | 4373968 | 1 | 02 Sep 24 08:29:26 UTC | 17 Apr 23 02:15:41 UTC | DONE | 02 Sep 24 08:29:41 UTC |
| healthcheck/cql | | 15s | | | 4373936 | 1 | 02 Sep 24 08:29:26 UTC | 17 Apr 23 02:15:41 UTC | DONE | 02 Sep 24 08:29:41 UTC |
| repair/manager-weekly-repair | | 7d | | | 101 | 0 | 31 Aug 24 11:30:02 UTC | | DONE | 07 Sep 24 11:30:00 UTC |
+------------------------------+--------+----------+--------+----------+---------+-------+------------------------+------------------------+--------+------------------------+
$ kubectl exec -it deployments/scylla-manager -n scylla-manager -- sctool progress --cluster scylla-manager/scylla-manager backup/manager-daily-backup
Run: 550809c3-6851-11ef-a3b5-b2c3114a5b19
Status: ERROR (initialising)
Cause: get backup target: create cluster session: TLS/SSL key/cert is not registered: not found
Start time: 01 Sep 24 11:00:00 UTC
End time: 01 Sep 24 11:00:00 UTC
Duration: 0s
Progress: -
The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 30d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out
/lifecycle stale
The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 30d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out
/lifecycle rotten
We don't have this cluster anymore and don't observe this issue with a new one.