flinkk8soperator

Job- and taskmanagers are not updated on change of docker image in FlinkApplication resource

Open anekdoti opened this issue 5 years ago • 3 comments

I updated a running FlinkApplication custom resource to Flink 1.9.1 and pointed it at a new Docker image (tag 1.0.5 below) containing a stream processor built on Flink 1.9.1.

The custom resource is updated:

Name:         ${namespace}-${job}
Namespace:    ${namespace}
Labels:       environment=DEV
              version=1.0.5
Annotations:  helm.fluxcd.io/antecedent: ${namespace}:helmrelease/${job}
API Version:  flink.k8s.io/v1beta1
Kind:         FlinkApplication
Metadata:
  Creation Timestamp:  2020-01-20T09:41:24Z
  Generation:          2
  Resource Version:    11470689
  Self Link:           /apis/flink.k8s.io/v1beta1/namespaces/${namespace}/flinkapplications/${namespace}-${job}
  UID:                 4d097bbf-677e-4bad-8422-5bccf8c8e879
Spec:
  Entry Class:        ${entryClass}
  Flink Version:      1.9.1
  Image:              ${privateDockerRegistry}/${project}/${job}:1.0.5
  Image Pull Policy:  Always
  Image Pull Secrets:
    Name:    ${privateDockerRegistry}-secret
  Jar Name:  ${jarName}.jar
  Job Manager Config:
    Replicas:  1
    Resources:
      Requests:
        Cpu:     0.2
        Memory:  512Mi
  Parallelism:   1
  Program Args:  --environments /kafka-environments-config/environments.yaml --environment DEV --checkpointing.interval: 60000 --checkpointing.minPause: 1000 --checkpointing.timeout: 60000 --team ${namespace}
  Task Manager Config:
    Resources:
      Requests:
        Cpu:     0.2
        Memory:  512Mi
    Task Slots:  2
  Volume Mounts:
    Mount Path:  /kafka-environments-config/environments.yaml
    Name:        kafka-environments-config
    Sub Path:    environments.yaml
    Mount Path:  /usr/local/flink-conf.yaml
    Name:        flink-config
    Sub Path:    flink-conf.yaml
  Volumes:
    Config Map:
      Name:  kafka-environments-config
    Name:    kafka-environments-config
    Config Map:
      Name:  ${namespace}-${job}-flink-config
    Name:    flink-config
Status:
  Cluster Status:
    Available Task Slots:  0
  Deploy Hash:
  Job Status:
    Jar Name:
    Parallelism:    0
  Last Updated At:  2020-01-20T09:41:24Z
  Phase:            ClusterStarting
Events:
  Type    Reason           Age   From              Message
  ----    ------           ----  ----              -------
  Normal  CreatingCluster  52m   flinkK8sOperator  Creating Flink cluster for deploy 1ce95fa5

However, the jobmanager and taskmanager deployments and pods are not updated or restarted with the new Docker image. That only happens after I delete the FlinkApplication custom resource and create it again. I'm using version 0.4.0 of the Flink operator.

anekdoti, Jan 20 '20 10:01
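To confirm which deployments are still running the old image, one option is to compare the container images in the output of `kubectl get deployments -n <namespace> -o json` against the image set in the FlinkApplication spec. The following is a minimal sketch; the function name `stale_deployments` is hypothetical, and it assumes the standard Kubernetes Deployment list layout (`items[].spec.template.spec.containers[].image`).

```python
import json
from typing import List


def stale_deployments(deployments_json: str, expected_image: str) -> List[str]:
    """Return the names of deployments that have at least one container
    whose image differs from expected_image.

    deployments_json is the raw text printed by
    `kubectl get deployments -n <namespace> -o json`.
    """
    data = json.loads(deployments_json)
    stale = []
    for item in data.get("items", []):
        containers = item["spec"]["template"]["spec"]["containers"]
        # A deployment counts as stale if any container runs another image.
        if any(c["image"] != expected_image for c in containers):
            stale.append(item["metadata"]["name"])
    return stale
```

An empty result would mean every jobmanager and taskmanager deployment in the namespace already references the expected image; any names returned are the deployments the operator did not roll over.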

Moreover, even after deleting and re-creating the resource, the job is not started in the successfully created cluster (similar to #161), even with patch #163 applied. Do you have any idea what I am still doing wrong? The status of the FlinkApplication is the same as in the describe output in my previous comment.

anekdoti, Jan 20 '20 10:01

Is there any news on this issue? I would really like to understand which changes to the custom resource trigger an update of the Flink cluster and which do not.

anekdoti, Jan 21 '20 15:01
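The describe output above shows a `Deploy Hash` status field and an event naming deploy `1ce95fa5`, which suggests that a deploy is identified by some hash of the resource. As a purely illustrative sketch of hash-based change detection (an assumption for discussion, not the operator's actual algorithm), a spec can be serialized canonically and hashed, so that any change to a hashed field, such as the image, yields a new identifier. The function name, the camelCase field names, the registry host, and the `1.0.4` tag below are all hypothetical.

```python
import hashlib
import json


def spec_hash(spec: dict) -> str:
    """Return a short, order-independent hash of a spec dictionary."""
    # Sorting keys makes the serialization independent of key order.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]


# Hypothetical specs: only the image tag differs between them.
old_spec = {"image": "registry.example/job:1.0.4", "flinkVersion": "1.9.1", "parallelism": 1}
new_spec = dict(old_spec, image="registry.example/job:1.0.5")

# Under this scheme, a differing hash is what would signal a redeploy.
needs_redeploy = spec_hash(old_spec) != spec_hash(new_spec)
```

Under such a scheme, only fields that are included in the hashed structure can trigger an update, which is one possible reason some edits to a resource have no visible effect.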

@anekdoti I'm having a similar problem. I opened issue #154; at first it was a problem that was fixed in version 0.4.0. However, the update cycle still fails for me every time. In my case, the status of my application was DeployFailed. They suggested that I check the operator's log to see whether the new image has a problem. Maybe this could help you?

lucaspg96, Jan 29 '20 11:01
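Before digging into the operator's log, it can help to read the phase and deploy hash straight from the resource, for example from `kubectl get flinkapplications <name> -n <namespace> -o json`. The sketch below is a minimal, hedged example: the helper name `deploy_state` is hypothetical, and the JSON keys `status.phase` and `status.deployHash` are assumptions inferred from the `Phase` and `Deploy Hash` labels in the describe output earlier in this thread.

```python
import json


def deploy_state(app_json: str) -> dict:
    """Extract the phase and deploy hash from a FlinkApplication JSON dump
    and flag the two conditions reported in this thread."""
    status = json.loads(app_json).get("status", {})
    phase = status.get("phase", "")            # assumed key name
    deploy_hash = status.get("deployHash", "")  # assumed key name
    return {
        "phase": phase,
        "deploy_hash": deploy_hash,
        # DeployFailed is the phase reported in the comment above.
        "failed": phase == "DeployFailed",
        # An empty deploy hash matches the describe output in the first post.
        "no_completed_deploy": deploy_hash == "",
    }
```

If `failed` is true, the operator's log is the next place to look, as suggested above; if the phase stays at `ClusterStarting` with an empty deploy hash, the situation matches the one in the original post.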