flinkk8soperator
Job- and taskmanagers are not updated on change of docker image in FlinkApplication resource
I updated a running FlinkApplication custom resource to Flink 1.9.1 and pointed it at a new Docker image (1.0.5 below) that contains a stream processor built with Flink 1.9.1.
The custom resource itself is updated:
Name:         ${namespace}-${job}
Namespace:    ${namespace}
Labels:       environment=DEV
              version=1.0.5
Annotations:  helm.fluxcd.io/antecedent: ${namespace}:helmrelease/${job}
API Version:  flink.k8s.io/v1beta1
Kind:         FlinkApplication
Metadata:
  Creation Timestamp:  2020-01-20T09:41:24Z
  Generation:          2
  Resource Version:    11470689
  Self Link:           /apis/flink.k8s.io/v1beta1/namespaces/${namespace}/flinkapplications/${namespace}-${job}
  UID:                 4d097bbf-677e-4bad-8422-5bccf8c8e879
Spec:
  Entry Class:        ${entryClass}
  Flink Version:      1.9.1
  Image:              ${privateDockerRegistry}/${project}/${job}:1.0.5
  Image Pull Policy:  Always
  Image Pull Secrets:
    Name:  ${privateDockerRegistry}-secret
  Jar Name:  ${jarName}.jar
  Job Manager Config:
    Replicas:  1
    Resources:
      Requests:
        Cpu:     0.2
        Memory:  512Mi
  Parallelism:   1
  Program Args:  --environments /kafka-environments-config/environments.yaml --environment DEV --checkpointing.interval: 60000 --checkpointing.minPause: 1000 --checkpointing.timeout: 60000 --team ${namespace}
  Task Manager Config:
    Resources:
      Requests:
        Cpu:     0.2
        Memory:  512Mi
    Task Slots:  2
  Volume Mounts:
    Mount Path:  /kafka-environments-config/environments.yaml
    Name:        kafka-environments-config
    Sub Path:    environments.yaml
    Mount Path:  /usr/local/flink-conf.yaml
    Name:        flink-config
    Sub Path:    flink-conf.yaml
  Volumes:
    Config Map:
      Name:  kafka-environments-config
    Name:    kafka-environments-config
    Config Map:
      Name:  ${namespace}-${job}-flink-config
    Name:    flink-config
Status:
  Cluster Status:
    Available Task Slots:  0
  Deploy Hash:
  Job Status:
    Jar Name:
    Parallelism:  0
  Last Updated At:  2020-01-20T09:41:24Z
  Phase:            ClusterStarting
Events:
  Type    Reason           Age  From              Message
  ----    ------           ---- ----              -------
  Normal  CreatingCluster  52m  flinkK8sOperator  Creating Flink cluster for deploy 1ce95fa5
However, the jobmanager and taskmanager deployments and pods are not updated or restarted with the new Docker image. That only happens after I delete the FlinkApplication custom resource and create it again. I'm using version 0.4.0 of the Flink operator.
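For anyone who wants to confirm the same mismatch, one option is to compare the image in the FlinkApplication spec with the images the jobmanager and taskmanager deployments actually run. Below is a minimal Python sketch, not part of the operator. It assumes the resources were exported as JSON (for example with kubectl get ... -o json) and loaded into dicts. The function name stale_deployments, the deployment names, the registry path and the 1.0.4 tag are all hypothetical. The field paths follow the standard Kubernetes Deployment layout and the Image field of the spec shown above.

```python
def stale_deployments(app, deployments):
    """Return the names of deployments that run an image other than spec.image."""
    wanted = app["spec"]["image"]
    stale = []
    for dep in deployments:
        containers = dep["spec"]["template"]["spec"]["containers"]
        # A deployment is stale if any of its containers uses a different image.
        if any(c["image"] != wanted for c in containers):
            stale.append(dep["metadata"]["name"])
    return stale


def make_deployment(name, image):
    """Build a minimal Deployment-shaped dict (hypothetical sample data only)."""
    return {
        "metadata": {"name": name},
        "spec": {"template": {"spec": {"containers": [{"image": image}]}}},
    }


# Hypothetical data mirroring this report: the spec says 1.0.5,
# but both deployments still run an older tag.
app = {"spec": {"image": "registry.example/project/job:1.0.5"}}
deployments = [
    make_deployment("job-jm", "registry.example/project/job:1.0.4"),
    make_deployment("job-tm", "registry.example/project/job:1.0.4"),
]

print(stale_deployments(app, deployments))
```

If the list is non-empty after the custom resource was changed, the cluster has not picked up the new image, which is the behaviour described here.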
Moreover, even after deleting the resource and creating it again, the job is not started in the successfully created cluster (similar to #161), even with patch #163 applied. Do you have any idea what I am still doing wrong? The status of the FlinkApplication matches the describe output in my previous comment.
Is there any news on this issue? I would really like to understand which changes to the custom resource trigger an update of the Flink cluster and which do not.
@anekdoti I'm having a similar problem. I opened issue #154, and at first it looked like just a problem that was fixed in version 0.4.0. However, the update cycle still always fails for me. In my case, the status of my application was DeployFailed. The maintainers suggested that I check the operator's log to see whether the new image has a problem. Maybe this could help you?
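The two situations mentioned in this thread can be told apart from the status block of the resource. The following Python sketch is only an illustration under assumptions: the function name triage and its hint texts are hypothetical, the keys phase and deployHash are inferred from the Phase and Deploy Hash fields in the describe output above, and only the two phases that appear in this thread are handled.

```python
def triage(status):
    """Map a FlinkApplication status dict to a short hint (illustrative only)."""
    phase = status.get("phase", "")
    if phase == "DeployFailed":
        # Suggested in this thread: look for problems with the new image.
        return "deploy failed: check the operator log"
    if phase == "ClusterStarting" and not status.get("deployHash"):
        # Matches the original report: cluster created, no deploy recorded yet.
        return "cluster starting: no deploy hash recorded yet"
    return "no hint for phase: " + phase


# Status values taken from the two cases described in this thread.
print(triage({"phase": "ClusterStarting", "deployHash": ""}))
print(triage({"phase": "DeployFailed"}))
```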