Can't apply managementApiAuth to existing datacenter
What happened?
When I added
managementApiAuth:
  manual:
    clientSecretName: cass-management-api-client
    serverSecretName: cass-management-api-server
to an existing datacenter, the StatefulSet update failed because the management API was still running on HTTP while the operator was already sending HTTPS requests:
2023-10-20T06:52:06.234Z ERROR controllers.CassandraDatacenter calculateReconciliationActions returned an error {"cassandradatacenter": "cass-db/dc2", "requestNamespace": "cass-db", "requestName": "dc2", "loopID": "14a47b0d-9f66-45c4-a330-9ef440c8754a", "error": "Post \"https://10.1.3.11:8080/api/v0/ops/seeds/reload\": http: server gave HTTP response to HTTPS client"}
github.com/k8ssandra/cass-operator/internal/controllers/cassandra.(*CassandraDatacenterReconciler).Reconcile
/workspace/internal/controllers/cassandra/cassandradatacenter_controller.go:149
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
What did you expect to happen?
I expected the operator to communicate with the pods and update their configuration to enable mTLS on both the operator and the management API sides.
How can we reproduce it (as minimally and precisely as possible)?
- Bring up a basic datacenter with
  managementApiAuth:
    insecure: {}
- Once the datacenter is up and running, create the certificate chain:
### MGMT API Auth
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cass-management-api-ca
spec:
  isCA: true
  commonName: cass-management-api
  secretName: cass-management-api-ca
  duration: 87600h0m0s # 10y
  privateKey:
    encoding: PKCS8
  issuerRef:
    name: cassandra-ca-issuer
    kind: Issuer
    group: cert-manager.io
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cass-management-api-issuer
spec:
  ca:
    secretName: cass-management-api-ca
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cass-management-api-server
spec:
  secretName: cass-management-api-server
  duration: 8760h0m0s # 1y
  issuerRef:
    name: cass-management-api-issuer
  dnsNames:
    - cass-management-api-server
  commonName: cass-management-api
  privateKey:
    encoding: PKCS8
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cass-management-api-client
spec:
  secretName: cass-management-api-client
  duration: 720h0m0s # 30d
  issuerRef:
    name: cass-management-api-issuer
  commonName: cass-management-api-user
  privateKey:
    encoding: PKCS8
- Update the datacenter configuration:
  managementApiAuth:
    manual:
      clientSecretName: cass-management-api-client
      serverSecretName: cass-management-api-server
cass-operator version
1.17.2
Kubernetes version
1.27.4
Method of installation
Argo
Anything else we need to know?
No response
A similar issue happens if I build a datacenter with manual managementApiAuth and then disable it: the operator sends an HTTP request to the HTTPS API endpoint.
Actions to take here:
- Instead of checking the CassandraDatacenter's HTTPS status, we need to check whether ObservedGeneration < Generation on the CassandraDatacenter and whether the target Pod has HTTPS enabled, and build the client security based on that information.
- The latter can be determined from the running configuration by inspecting the Pod, i.e. whether the target server secret is mounted there: either the Pod's ENV variable MGMT_API_TLS_CERT_FILE is set, or a volume called "management-api-server-certs-volume" is mounted to the pod.
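The pod inspection described above could be sketched roughly as follows. To keep the example self-contained, the `Pod`, `Container`, `EnvVar` and `Volume` types are simplified stand-ins for the corresponding `corev1` types, and `podServesHTTPS` is a hypothetical helper name; only the env variable and volume names come from this issue.

```go
package main

import "fmt"

// Simplified stand-ins for the few corev1.Pod fields this check needs.
// Real code would use k8s.io/api/core/v1 instead (assumption).
type EnvVar struct{ Name, Value string }

type Container struct {
	Name string
	Env  []EnvVar
}

type Volume struct{ Name string }

type Pod struct {
	Containers []Container
	Volumes    []Volume
}

const (
	tlsCertEnvVar     = "MGMT_API_TLS_CERT_FILE"
	serverCertsVolume = "management-api-server-certs-volume"
)

// podServesHTTPS reports whether the pod looks like it was started with the
// management API TLS settings: either the cert env variable is set on some
// container, or the server certs volume is present on the pod.
func podServesHTTPS(pod Pod) bool {
	for _, c := range pod.Containers {
		for _, e := range c.Env {
			if e.Name == tlsCertEnvVar && e.Value != "" {
				return true
			}
		}
	}
	for _, v := range pod.Volumes {
		if v.Name == serverCertsVolume {
			return true
		}
	}
	return false
}

func main() {
	insecure := Pod{Containers: []Container{{Name: "cassandra"}}}
	secured := Pod{
		Containers: []Container{{Name: "cassandra"}},
		Volumes:    []Volume{{Name: serverCertsVolume}},
	}
	fmt.Println(podServesHTTPS(insecure), podServesHTTPS(secured))
}
```

Because the check looks at the running pod rather than the CassandraDatacenter spec, it should give the right answer in both directions (enabling and disabling manual auth) while pods are still being rolled.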
@burmanm is this ticket still current? It is from last year so I'm confirming before I start work on it.
Also, can you clarify what you mean by check if ObservedGeneration < Generation of CassandraDatacenter? What ObservedGeneration are we checking? Why would the ObservedGeneration and "Generation" be different if you're referring to the CassandraDatacenter in both cases?
| @burmanm is this ticket still current? It is from last year so I'm confirming before I start work on it.
Yes, it is.
| Also, can you clarify what you mean by check if ObservedGeneration < Generation of CassandraDatacenter? What ObservedGeneration are we checking? Why would the ObservedGeneration and "Generation" be different if you're referring to the CassandraDatacenter in both cases?
Whenever you update the CassandraDatacenter, the ObservedGeneration is less than Generation until the reconciliation has finished. In this case, it wouldn't finish, since it fails midway. However, I think you can just skip that check for now and simply verify whether the target pod has HTTPS enabled for mgmt-api.
Currently, the signature is func BuildManagementApiSecurityProvider(dc *api.CassandraDatacenter), which isn't sufficient. On the other hand, make sure the SecurityProvider is still the HTTPS one when constructing the PodTemplateSpec and making the other required modifications, since enabling auth requires the HTTPS provider to supply the curl commands and other properties. So it's not just the httphelper client that is affected.
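To illustrate the split between the two decisions, here is a rough sketch, not the actual cass-operator API. `DatacenterSpec`, `templateScheme`, `clientScheme` and `managementURL` are all hypothetical names. The pod template follows the desired spec, while the client for an individual pod follows what that pod is currently serving. Port 8080 and the path are taken from the log in the report.

```go
package main

import "fmt"

// DatacenterSpec is a hypothetical, trimmed-down stand-in for the
// CassandraDatacenter spec: it only records whether manual auth is requested.
type DatacenterSpec struct {
	ManualAuth bool
}

// templateScheme is what newly created pods should serve. It follows the
// desired spec, so the PodTemplateSpec is always built for the target state.
func templateScheme(spec DatacenterSpec) string {
	if spec.ManualAuth {
		return "https"
	}
	return "http"
}

// clientScheme is what the operator should use when calling one specific
// pod. It follows the pod's current state, not the spec.
func clientScheme(podServesHTTPS bool) string {
	if podServesHTTPS {
		return "https"
	}
	return "http"
}

// managementURL builds the management API URL for a single pod.
func managementURL(podServesHTTPS bool, podIP, path string) string {
	return fmt.Sprintf("%s://%s:8080%s", clientScheme(podServesHTTPS), podIP, path)
}

func main() {
	spec := DatacenterSpec{ManualAuth: true} // auth was just enabled
	oldPodServesHTTPS := false               // pod not yet restarted

	fmt.Println("template:", templateScheme(spec))
	fmt.Println("client:  ", managementURL(oldPodServesHTTPS, "10.1.3.11", "/api/v0/ops/seeds/reload"))
}
```

During the transition the two values are expected to disagree for pods that have not been restarted yet, which is exactly the case that fails today.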