
[Core] Fix PG leakage caused by a GCS restart when the PG has not been successfully removed after the job died

Open larrylian opened this issue 1 year ago • 0 comments

Why are these changes needed?

Reproduction steps:
1. The task has finished.
2. GCS starts deleting the PG of this task.
3. GCS restarts before the PG has been deleted successfully.
4. After GCS restarts, the PG is leaked and remains in the "creating" status.
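The reproduction steps can be illustrated with a small toy model. This is only a sketch and not Ray's actual GCS code: `ToyStore`, `ToyGcs`, `simulate`, and the `cleanup_on_restart` flag are hypothetical names, and the model assumes that PG records and job death are persisted while in-flight removals are held only in memory.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical persistent storage that survives a GCS restart."""
    pg_state: dict = field(default_factory=dict)  # pg_id -> state string
    pg_job: dict = field(default_factory=dict)    # pg_id -> owning job_id
    dead_jobs: set = field(default_factory=set)   # jobs known to be dead


class ToyGcs:
    """Hypothetical stand-in for the GCS placement group manager."""

    def __init__(self, store, cleanup_on_restart=False):
        self.store = store
        # Assumption: removals in flight live only in memory,
        # so a restart forgets them.
        self.pending_removals = set()
        if cleanup_on_restart:
            self.remove_pgs_of_dead_jobs()

    def create_pg(self, pg_id, job_id):
        self.store.pg_state[pg_id] = "creating"
        self.store.pg_job[pg_id] = job_id

    def on_job_finished(self, job_id):
        # Record the job as dead and schedule its PGs for removal.
        self.store.dead_jobs.add(job_id)
        for pg_id, owner in self.store.pg_job.items():
            if owner == job_id:
                self.pending_removals.add(pg_id)

    def remove_pgs_of_dead_jobs(self):
        # On startup, drop every persisted PG whose owning job is dead.
        for pg_id, owner in list(self.store.pg_job.items()):
            if owner in self.store.dead_jobs:
                del self.store.pg_state[pg_id]
                del self.store.pg_job[pg_id]

    def leaked_pgs(self):
        # A PG is leaked if it still exists although its job is dead.
        return {
            pg_id: self.store.pg_state[pg_id]
            for pg_id, owner in self.store.pg_job.items()
            if owner in self.store.dead_jobs
        }


def simulate(cleanup_on_restart):
    store = ToyStore()
    gcs = ToyGcs(store)
    gcs.create_pg("pg-1", "job-1")
    gcs.on_job_finished("job-1")  # steps 1-2: job done, removal scheduled
    # Step 3: GCS restarts before the scheduled removal completes.
    gcs = ToyGcs(store, cleanup_on_restart=cleanup_on_restart)
    return gcs.leaked_pgs()       # step 4: what is left after the restart
```

In this model, `simulate(False)` leaves `pg-1` behind in the "creating" state, while `simulate(True)` leaves nothing behind because the restarted instance re-checks persisted PGs against the set of dead jobs.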

I will continue to analyze the similar problem of a GCS restart that occurs while an Actor in the creating state is being released, and I will fix it in a new PR in a few days.

Related issue number

Checks

  • [x] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    • [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [x] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

larrylian, May 25 '23 12:05