LDAP Certificate not loading
I have a .crt file for an internal CA. Calling an internal resource with curl --cacert ca.crt ldaps://xyz:636 works in terms of verifying the certificate. However, adding the cert as a secret and referencing it in the manifest continues to produce errors. Running this in k3s.
Creating the secret
kubectl -n awx create secret generic awx-custom-certs --from-file=ldap-ca.crt=./ca.crt --from-file=bundle-ca.crt=./ca.crt
secret/awx-custom-certs created
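As a sanity check (not part of the original steps), the secret's keys and contents can be confirmed with something like:
# keys should show bundle-ca.crt and ldap-ca.crt with non-zero sizes
kubectl -n awx describe secret awx-custom-certs
# decode one key and confirm it is the expected PEM certificate
kubectl -n awx get secret awx-custom-certs -o jsonpath='{.data.ldap-ca\.crt}' | base64 -d | head -n 3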
Basic deploy
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  service_type: nodeport
  ldap_cacert_secret: awx-custom-certs
  bundle_cacert_secret: awx-custom-certs
  hostname: xxx
  web_resource_requirements: {}
  ee_resource_requirements: {}
  task_resource_requirements: {}
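Once the operator has reconciled the CR, one way to confirm it actually wired the secret into the generated deployment (a sanity check, not part of the original report) is to inspect the volumes and mounts:
# the custom cert secret should show up among the deployment's volumes
kubectl -n awx get deployment awx -o jsonpath='{.spec.template.spec.volumes[*].name}'
# and as a volumeMount in the awx-web container
kubectl -n awx get deployment awx -o jsonpath='{.spec.template.spec.containers[?(@.name=="awx-web")].volumeMounts}'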
After the deploy I see this in the manager logs, with no errors:
kubectl -n awx logs deployments/awx-operator-controller-manager -c manager
PLAY RECAP *********************************************************************
localhost : ok=58 changed=0 unreachable=0 failed=0 skipped=37 rescued=0 ignored=0
awx-custom-certs is in the same namespace, which is again expected since I didn't get any errors from the operator:
kubectl -n awx get awx,all,ingress,secrets,persistentvolume
NAME AGE
awx.awx.ansible.com/awx 5m55s
NAME READY STATUS RESTARTS AGE
pod/awx-operator-controller-manager-68d787cfbd-fnv9c 2/2 Running 0 7m5s
pod/awx-postgres-0 1/1 Running 0 5m36s
pod/awx-559fcd895-tfxl9 4/4 Running 0 5m27s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/awx-operator-controller-manager-metrics-service ClusterIP 10.43.6.215 <none> 8443/TCP 7m5s
service/awx-postgres ClusterIP None <none> 5432/TCP 5m36s
service/awx-service NodePort 10.43.230.186 <none> 80:30098/TCP 5m29s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/awx-operator-controller-manager 1/1 1 1 7m5s
deployment.apps/awx 1/1 1 1 5m27s
NAME DESIRED CURRENT READY AGE
replicaset.apps/awx-operator-controller-manager-68d787cfbd 1 1 1 7m5s
replicaset.apps/awx-559fcd895 1 1 1 5m27s
NAME READY AGE
statefulset.apps/awx-postgres 1/1 5m36s
NAME TYPE DATA AGE
secret/default-token-bdtcs kubernetes.io/service-account-token 3 7m5s
secret/awx-operator-controller-manager-token-t4bcn kubernetes.io/service-account-token 3 7m5s
secret/awx-custom-certs Opaque 2 6m26s
secret/awx-app-credentials Opaque 3 5m32s
secret/awx-token-qtdrv kubernetes.io/service-account-token 3 5m31s
secret/awx-admin-password Opaque 1 5m45s
secret/awx-secret-key Opaque 1 5m49s
secret/awx-postgres-configuration Opaque 6 5m38s
secret/awx-broadcast-websocket Opaque 1 5m42s
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-a88f849b-93e3-4014-8a6b-1b4e63762bc0 8Gi RWO Delete Bound awx/postgres-awx-postgres-0 local-path 5m34s
Yet LDAPS still doesn't function
kubectl -n awx logs awx-559fcd895-tfxl9 -c awx-web
2021-11-11 13:34:12,888 WARNING [e1a9950bf56e4dac94410d3c1a42a4aa] django_auth_ldap Caught LDAPError while authenticating xxxx: SERVER_DOWN({'result': -1, 'desc': "Can't contact LDAP server", 'ctrls': [], 'info': 'error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed (unable to get issuer certificate)'})
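One diagnostic that narrows this down (assuming openssl is available in the image; the mount path below is a placeholder for wherever the CA file actually lands in the container) is to test the TLS handshake against the LDAP server from inside the web container:
# exec into the web container and verify the chain against the mounted CA file
kubectl -n awx exec deployment/awx -c awx-web -- \
  openssl s_client -connect ldapserver:636 -CAfile /path/to/ldap-ca.crt
# "Verify return code: 0 (ok)" means the CA file is sufficient;
# "unable to get issuer certificate" means an intermediate or root cert is missing from it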
@rrobe53 can you try asking on the mailing list too and report back here if you find the answer?
Added it to the mailing list.
I also duplicated this test on my Mac using Minikube and hyperkit with the same results, to take K3S out of the equation.
Could you share your LDAP configuration, and also specify the versions of the operator and AWX that you are using? I got the LDAP setup working via extra_settings with AWX 19.4.0 and an operator built from my PR.
You can check out my branch in that PR for reference. It worked on my Azure AKS cluster with the LDAP CA cert synced from Azure Key Vault.
https://github.com/ansible/awx-operator/pull/659
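For reference, extra_settings in the AWX spec takes a list of setting/value pairs; a minimal LDAP sketch (placeholder values, and the exact quoting of string values may vary) looks roughly like:
spec:
  extra_settings:
    # values are rendered into the settings file, so string values carry an extra layer of quotes
    - setting: AUTH_LDAP_SERVER_URI
      value: '"ldaps://ldapserver:636"'
    - setting: AUTH_LDAP_START_TLS
      value: "False"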
I was using 19.4.0 and 0.14.0 in the initial test; I just tried again in the k3s environment with 19.5.0 and 0.15.0, with the same result.
I'm using the same LDAP setup that's working on a much older version (8.0.0). I'm not applying it with extra settings, just the bare deploy above. I can trigger the LDAP failure message by configuring essentially nothing but the LDAP server name in the LDAP settings; however, I've copied everything else over. I use the same ca.crt with curl against https://ldapserver:636 and it works (past the SSL handshake at least).
I had a very similar issue with CA certs. I had to provide the entire CA chain as the input to the secrets. If I provided just the CA cert, I had the same error as you. When I provided the CA and the root CA certificate, things magically started to work.
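In practice that means concatenating the issuing CA and the root CA into one PEM bundle before creating the secret, roughly (file names here are just examples):
# build a single PEM bundle containing the issuing CA and the root CA
cat intermediate-ca.crt root-ca.crt > ca-chain.crt
# recreate the secret so both keys carry the full chain
kubectl -n awx delete secret awx-custom-certs
kubectl -n awx create secret generic awx-custom-certs \
  --from-file=ldap-ca.crt=./ca-chain.crt --from-file=bundle-ca.crt=./ca-chain.crt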
@rrobe53 you've provided a lot of information, but I cannot see the results from inside the container. Are the certs properly propagated?
I've reproduced this on k3s: the certificate is not getting properly updated in the container, which leads to an 'unable to verify the first certificate' issue when trying to use LDAPS:
SERVER_DOWN({'result': -1, 'desc': "Can't contact LDAP server", 'ctrls': [],
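To confirm whether the file in the container actually matches the secret (the mount path below is a placeholder, adjust it to wherever the cert is mounted in your pod):
# checksum of the cert as mounted in the running container
kubectl -n awx exec deployment/awx -c awx-web -- md5sum /path/to/ldap-ca.crt
# checksum of the cert as stored in the secret; the two should match
kubectl -n awx get secret awx-custom-certs -o jsonpath='{.data.ldap-ca\.crt}' | base64 -d | md5sum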