flink-on-k8s-operator
How to start job after jobmanager fails for whatever reason?
Hi folks, I am still getting familiar with the Flink operator, and I would like to ask for your help with the following question. After I start a new Flink job cluster, a new pod comes up and submits the job to the JobManager. After a while the pod goes into the Completed state, and the job keeps running. In my use case there is no persistent volume in my cluster, and there is no need to set up any savepoints. All I would like to achieve is to make sure that the job is started again whenever the JobManager fails. I don't need to restore anything, just run the job again. Is there any way to do this other than setting up a persistent volume and a savepointsDir with auto savepoints?
Set the HA properties for the JobManager in flink-conf.yaml (see https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/). With HA enabled, a restarted JobManager should recover its state, including the running job, from the remote storage.
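For illustration, a minimal sketch of what those HA properties might look like, based on the Kubernetes HA page linked above. The cluster ID and the storage path are placeholders I made up, not values from this thread. The storage directory has to be reachable from every JobManager pod, so without a persistent volume it would presumably need to point at object storage or a similar remote filesystem. Depending on the operator version, these keys may need to go into the cluster resource's Flink properties rather than directly into the file.

```yaml
# Sketch only: Kubernetes HA settings as described in the linked Flink docs.
# "my-flink-cluster" and the bucket path are hypothetical placeholders.
kubernetes.cluster-id: my-flink-cluster
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
# Remote location where JobManager metadata is persisted for recovery.
high-availability.storageDir: s3://my-bucket/flink/recovery
```

Newer Flink releases may spell the HA option differently, so it is worth checking the docs page for the Flink version in use.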