flink-on-k8s-operator

Flink cluster installation fails with error "failed calling webhook"

Open vinaykw opened this issue 4 years ago • 2 comments

I kept the Flink operator and a Flink session cluster running together for more than 35 days. I use Helm to install both the Flink operator and the Flink session cluster. When I uninstalled the Flink session cluster and tried to reinstall it, I got the error below:

Error: Internal error occurred: failed calling webhook "mflinkcluster.flinkoperator.k8s.io": Post https://flink-operator-webhook-service.reporting-flink-operator.svc:443/mutate-flinkoperator-k8s-io-v1beta1-flinkcluster?timeout=30s: x509: certificate has expired or is not yet valid

Please help me resolve this issue.

vinaykw, Dec 02 '20 10:12
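
The "certificate has expired or is not yet valid" message can be confirmed by inspecting the webhook certificate directly. The 35-day uptime would be consistent with a certificate signed by `openssl x509 -req` without `-days`, which defaults to 30 days. The following is a minimal sketch, assuming the serving certificate has already been saved locally as a PEM file; the file name `server-cert.pem` and the helper `cert_status` are hypothetical and not part of the operator.

```shell
# Hypothetical helper: report whether a PEM certificate has passed its
# expiry date. It does not check the notBefore ("not yet valid") side.
cert_status() {
  cert_file="$1"
  # -checkend 0 exits 0 if the certificate does not expire within 0 seconds.
  if openssl x509 -checkend 0 -noout -in "$cert_file" >/dev/null 2>&1; then
    echo "valid"
  else
    echo "expired-or-unreadable"
  fi
}

# Usage sketch (assumes server-cert.pem exists in the current directory):
#   openssl x509 -noout -enddate -in server-cert.pem   # prints notAfter=...
#   cert_status server-cert.pem
```

If the helper reports an expired certificate, regenerating it (as discussed in the linked issue) is the relevant fix; if it reports a valid one, the problem likely lies elsewhere.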

@vinaykw please see the workaround in https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/356

guruprasathT, Dec 08 '20 03:12

Following https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/356, we tried all the steps mentioned there, but the Flink session cluster installation still fails.
First we tried the steps below, which did not help:

        kubectl get job cert-job -n flink-operator-system -oyaml > cert-job.yaml
        kubectl delete job cert-job -n flink-operator-system
        kubectl apply -f cert-job.yaml

Next we edited the config map to extend the certificate's default validity period (in days), which also did not help. We changed

        | openssl x509 -req -CA ca.crt -CAkey ca.key -CAcreateserial -out ${tmpdir}/server-cert.pem

to

        | openssl x509 -days 3650 -req -CA ca.crt -CAkey ca.key -CAcreateserial -out ${tmpdir}/server-cert.pem

and then recreated the config map and the cert job:

k delete -f config-map-up1.yaml -n flink-operator-system
configmap "cert-configmap" deleted
 
 
k apply -f config-map-up1.yaml -n flink-operator-system
configmap/cert-configmap created
 
 
 
kubectl get pods -n flink-operator-system
NAME                                                 READY   STATUS    RESTARTS   AGE
flink-operator-controller-manager-848b69b444-8v9l5   2/2     Running   0          43m
 
 
 
k apply -f cert-job-1.yaml -n flink-operator-system
job.batch/cert-job created
 
 
kubectl get pods -n flink-operator-system
NAME                                                 READY   STATUS      RESTARTS   AGE
cert-job-lgxzt                                       0/1     Completed   0          7s
flink-operator-controller-manager-848b69b444-8v9l5   2/2     Running     0          44m
 
 
 kubectl apply -f config/samples/flinkoperator_v1beta1_flinksessioncluster.yaml
Error from server (InternalError): error when creating "config/samples/flinkoperator_v1beta1_flinksessioncluster.yaml": Internal error occurred: failed calling webhook "mflinkcluster.flinkoperator.k8s.io": Post "https://flink-operator-webhook-service.flink-operator-system.svc:443/mutate-flinkoperator-k8s-io-v1beta1-flinkcluster?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0

sumchak1, Jul 16 '21 07:07
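
The last error differs from the original one: the regenerated certificate is no longer expired, but it identifies the webhook service only through the Common Name field. Go 1.15 and later reject such certificates by default, so an API server built with such a Go version needs the service's DNS names in a subjectAltName (SAN) extension. The following is a minimal sketch of how the SAN line for the signing step might be produced. The helper name `san_ext_line`, the file name `san.ext`, and the `cluster.local` cluster domain are assumptions; the service and namespace names are taken from the error message above.

```shell
# Hypothetical helper: print an openssl extension line listing the DNS names
# under which the API server reaches a webhook Service.
# Assumes the default cluster domain "cluster.local".
san_ext_line() {
  service="$1"
  namespace="$2"
  printf 'subjectAltName=DNS:%s.%s.svc,DNS:%s.%s.svc.cluster.local\n' \
    "$service" "$namespace" "$service" "$namespace"
}

# Usage sketch (not executed here): write the extension file, then pass it
# to the signing command from the config map via -extfile.
#   san_ext_line flink-operator-webhook-service flink-operator-system > san.ext
#   ... | openssl x509 -days 3650 -req -CA ca.crt -CAkey ca.key -CAcreateserial \
#           -extfile san.ext -out ${tmpdir}/server-cert.pem
```

Whether the operator's cert job can be changed this way depends on how the script in `cert-configmap` is structured, so this should be treated as a starting point rather than a confirmed fix.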