flink-on-k8s-operator

Failed calling webhook: context deadline exceeded on AWS EKS

Open acesir opened this issue 5 years ago • 2 comments

We are seeing the error below when creating the Flink job cluster from the sample provided in the repo. Our Flink deployment with the operator and job cluster works fine on Azure AKS, but fails with this error on AWS EKS.

Error from server (InternalError): error when creating "flink-on-k8s-operator-flink-operator-0.2.0/helm-chart/flink-job-cluster/flink-job-cluster.yaml": Internal error occurred: failed calling webhook "mflinkcluster.flinkoperator.k8s.io": Post https://flink-operator-webhook-service.flink-operator-system.svc:443/mutate-flinkoperator-k8s-io-v1beta1-flinkcluster?timeout=30s: context deadline exceeded
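The error names the exact endpoint the API server timed out on. As a rough aid for anyone debugging this, here is a minimal Python sketch that splits that URL into the Service name, namespace, port, path and timeout. It is not part of the operator; `describe_webhook` is a hypothetical helper, and it assumes the usual `<service>.<namespace>.svc` form of in-cluster DNS names.

```python
from urllib.parse import urlparse, parse_qs

# Webhook URL copied verbatim from the error message above.
WEBHOOK_URL = (
    "https://flink-operator-webhook-service.flink-operator-system.svc:443"
    "/mutate-flinkoperator-k8s-io-v1beta1-flinkcluster?timeout=30s"
)


def describe_webhook(url):
    """Hypothetical helper: split an in-cluster webhook URL into its parts."""
    parsed = urlparse(url)
    # Assumption: the host has the form <service>.<namespace>.svc
    service, namespace = parsed.hostname.split(".")[:2]
    return {
        "service": service,
        "namespace": namespace,
        "port": parsed.port,
        "path": parsed.path,
        # parse_qs returns lists; take the first value if present.
        "timeout": parse_qs(parsed.query).get("timeout", [None])[0],
    }


info = describe_webhook(WEBHOOK_URL)
for key, value in info.items():
    print(f"{key}: {value}")
```

The Service and namespace it prints are a reasonable starting point when checking whether the EKS control plane can reach the operator's webhook within the 30s timeout.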

The job-cluster YAML:

apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkCluster
metadata:
  name: flink-job-cluster
  labels:
    app: flink-job-cluster
    chart: flink-job-cluster-0.1.51
    release: flink-job-cluster
spec:
  image:
    name: "flink:1.9.3"
  envVars:
    - name: HADOOP_CLASSPATH
      value: /opt/flink/opt/flink-metrics-prometheus-1.9.3.jar    
  jobManager:
    accessScope: Cluster
    ports:
      ui: 8081
    extraPorts:
      - containerPort: 9249
        name: prom    
    resources:
      limits:
        cpu: 200m
        memory: 1024Mi
    podAnnotations:
      fluentbit.io/parser: foo
  taskManager:
    replicas: 2
    extraPorts:
      - containerPort: 9249
        name: prom
        protocol: TCP  
    resources:
      limits:
        cpu: 200m
        memory: 1024Mi
    podAnnotations:
      fluentbit.io/parser: foo
  job:
    jarFile: ./examples/streaming/WordCount.jar
    className: org.apache.flink.streaming.examples.wordcount.WordCount
    args: ["--input", "./README.txt"]
    parallelism: 
    restartPolicy: Never 
    podAnnotations:
      fluentbit.io/parser: foo
  flinkProperties:
    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    taskmanager.numberOfTaskSlots: "1"

Using flink-operator 0.2.0 and EKS 1.17

acesir, Oct 08 '20 12:10

Hi, did you find a resolution to this issue?

swagy-tarun, Feb 02 '21 10:02

Hello,

This issue may have the same origin as #399. See https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/399#issuecomment-1206193790.

emmanuelCarre, Aug 05 '22 08:08