flink-on-k8s-operator
How to recover the job manager from Checkpoints
Hi,
How do we recover the job manager from checkpoints instead of savepoints? If there are instructions or steps to follow, please share them.
Thanks, Shravan
Recovering from checkpoints is transparent to the operator. Flink handles it itself, so you don't need to worry about it.
@functicons I am documenting resiliency testing by disrupting task managers and job managers, and I would like to understand how the recovery happens. Is there a way you can help with my testing? Would it be possible to connect with you offline? Also, I have set up a 3-node ZooKeeper ensemble along with the operator and the Flink cluster, but I am having issues setting up the high-availability configuration needed for the disruption testing. I just need some pointers on these two items.
@shravangit20 I would be interested in how (or if) you ultimately solved this. At present, when the job fails, I locate the last available checkpoint and manually feed it back in through the fromSavepoint parameter.
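For anyone doing the same thing by hand, the "locate the last available checkpoint" step can be sketched roughly as below. This is only a minimal sketch under some assumptions: retained checkpoints are enabled, the job's checkpoint directory is reachable as a local or mounted path, and it follows Flink's usual layout of `chk-<n>` subdirectories that each contain a `_metadata` file once complete. The function name `latest_checkpoint` and the example path are hypothetical, and for S3, GCS, or HDFS you would need to list objects with the matching client instead.

```python
import re
from pathlib import Path
from typing import Optional

# Flink names completed checkpoint directories chk-<number>.
CHK_DIR = re.compile(r"^chk-(\d+)$")


def latest_checkpoint(job_checkpoint_dir: str) -> Optional[Path]:
    """Return the highest-numbered chk-N directory that has a _metadata file.

    Hypothetical helper: assumes the directory is on a local or mounted
    filesystem. Returns None if no complete checkpoint is found.
    """
    root = Path(job_checkpoint_dir)
    if not root.is_dir():
        return None

    best_id = -1
    best_path: Optional[Path] = None
    for child in root.iterdir():
        match = CHK_DIR.match(child.name)
        if match is None or not child.is_dir():
            continue
        # Skip checkpoints without _metadata; they are likely incomplete.
        if not (child / "_metadata").is_file():
            continue
        checkpoint_id = int(match.group(1))  # compare numerically, not as text
        if checkpoint_id > best_id:
            best_id, best_path = checkpoint_id, child
    return best_path


if __name__ == "__main__":
    # Example path is an assumption; substitute your own job's directory.
    found = latest_checkpoint("/flink/checkpoints/my-job-id")
    print(found if found is not None else "no complete checkpoint found")
```

The returned path is what I then paste into fromSavepoint. Comparing the IDs as integers matters, since a plain string sort would rank `chk-9` above `chk-10`.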