flink
[FLINK-28515][checkpoint] Try to clean up localSnapshot files after checkpoint aborted
What is the purpose of the change
This pull request fixes the problem that files in the local recovery directory are not cleaned up properly after a checkpoint is aborted.
Brief change log
- When TaskLocalStateStoreImpl aborts a checkpoint (abortCheckpoint), check whether that checkpoint has been registered in TaskLocalStateStoreImpl
- Try to delete the localRecovery directory even if the checkpoint isn't registered in TaskLocalStateStoreImpl
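The two changes above can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the actual TaskLocalStateStoreImpl code: the class LocalStoreSketch, its methods, and the chk_<id> directory layout are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical, simplified stand-in for a task-local state store (not Flink's class). */
class LocalStoreSketch {
    private final Path baseDir;
    // Checkpoints whose local snapshot has been registered with the store.
    private final Map<Long, String> registered = new HashMap<>();

    LocalStoreSketch(Path baseDir) {
        this.baseDir = baseDir;
    }

    /** Assumed directory layout: one sub-directory per checkpoint id. */
    Path checkpointDir(long checkpointId) {
        return baseDir.resolve("chk_" + checkpointId);
    }

    void register(long checkpointId, String snapshot) {
        registered.put(checkpointId, snapshot);
    }

    boolean isRegistered(long checkpointId) {
        return registered.containsKey(checkpointId);
    }

    void abortCheckpoint(long checkpointId) throws IOException {
        // Step 1: check whether the checkpoint was registered; if so, drop the bookkeeping.
        String snapshot = registered.remove(checkpointId);
        if (snapshot != null) {
            // A real store would discard the registered state object here.
        }
        // Step 2: try to delete the directory even if nothing was registered,
        // so that files written before registration do not leak.
        deleteRecursively(checkpointDir(checkpointId));
    }

    private static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return; // nothing to clean up
        }
        List<Path> deepestFirst;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order puts children before their parent directory.
            deepestFirst = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : deepestFirst) {
            Files.delete(p);
        }
    }
}
```

In this sketch, aborting a checkpoint that was never registered still removes its directory, which is the behaviour the second bullet describes.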
Verifying this change
This change added tests and can be verified as follows: org.apache.flink.runtime.state.TaskLocalStateStoreImplTest#abortUnregisteredCheckpoint()
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with @Public(Evolving): no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: yes
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? no
CI report:
- 3856f807fb83149dca2d4261ef2443a9e82a1ac1 Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:
- @flinkbot run azure: re-run the last Azure build
@ljz2051 Thanks for your contribution. Could you please rebase onto master to resolve the conflicts? Thanks.