
[core] GCS get all actors info no longer needs to specify whether dead jobs' actors should be included

Open rickyyx opened this issue 3 years ago • 5 comments

Why are these changes needed?

With https://github.com/ray-project/ray/pull/31019, we always have up to 10k dead actors cached by default.

There's no existing use case where we don't want to have actors from dead jobs (AFAIK).

With https://github.com/ray-project/ray/pull/34348, one can also filter actors by job ID.
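
To illustrate the reasoning above, here is a minimal, self-contained sketch of an actor table that always keeps a bounded number of dead actors and lets the caller narrow results by job ID. It is not the GCS implementation: `ActorTable`, `ActorRecord`, `mark_dead`, and `get_all` are hypothetical names, and the oldest-first eviction policy is an assumption made for the example.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Dict, List, Optional

# Mirrors the "up to 10k dead actors" default mentioned above.
MAX_DEAD_ACTORS = 10_000


@dataclass(frozen=True)
class ActorRecord:
    actor_id: str
    job_id: str
    name: str
    state: str  # "ALIVE" or "DEAD"


class ActorTable:
    """Hypothetical in-memory stand-in for an actor table (not GCS code)."""

    def __init__(self, max_dead: int = MAX_DEAD_ACTORS) -> None:
        self._alive: Dict[str, ActorRecord] = {}
        # Insertion-ordered so the oldest dead actor can be evicted first
        # (assumed policy, chosen only for this example).
        self._dead: "OrderedDict[str, ActorRecord]" = OrderedDict()
        self._max_dead = max_dead

    def register(self, actor_id: str, job_id: str, name: str) -> None:
        self._alive[actor_id] = ActorRecord(actor_id, job_id, name, "ALIVE")

    def mark_dead(self, actor_id: str) -> None:
        rec = self._alive.pop(actor_id)
        self._dead[actor_id] = ActorRecord(rec.actor_id, rec.job_id, rec.name, "DEAD")
        # Bound memory use: drop the oldest dead actors beyond the limit.
        while len(self._dead) > self._max_dead:
            self._dead.popitem(last=False)

    def get_all(self, job_id: Optional[str] = None) -> List[ActorRecord]:
        # Dead actors are always included; there is no "include dead jobs" flag.
        records = list(self._alive.values()) + list(self._dead.values())
        if job_id is None:
            return records
        return [r for r in records if r.job_id == job_id]
```

With a table like this, a caller that only wants one job's actors passes `job_id`, and a caller that wants everything passes nothing, so a separate flag for dead jobs' actors carries no extra information.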

Related issue number

Checks

  • [ ] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • [ ] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    • [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [ ] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

rickyyx avatar Apr 14 '23 21:04 rickyyx

cc @alanwguo, is this flag used on your side?

No, we don't use that flag today. In the progress bar summary API, we currently get the actor name from the actor task name (by parsing it out of <ActorName>.<method_name>). This was a hack: the actor list API did not return actors for dead jobs, so we couldn't look up the actor name from the actor ID.

Now we can use the actor API to get the actor name from the actor ID, but we haven't made that change yet.
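
For readers unfamiliar with the workaround, the two approaches can be sketched roughly as follows. This is an illustration only, not the dashboard's code: both function names are hypothetical, and the `actors_by_id` mapping stands in for whatever the actor API returns.

```python
from typing import Dict, Optional


def actor_name_from_task_name(task_name: str) -> Optional[str]:
    """The hack: recover the actor name from "<ActorName>.<method_name>"."""
    # Split on the last dot so the method name is always the final component.
    actor_name, sep, method_name = task_name.rpartition(".")
    if not sep or not actor_name or not method_name:
        return None  # Not in the expected "<ActorName>.<method_name>" form.
    return actor_name


def actor_name_from_actor_id(
    actor_id: str, actors_by_id: Dict[str, str]
) -> Optional[str]:
    """The direct lookup, assuming a mapping from actor ID to actor name."""
    return actors_by_id.get(actor_id)
```

The lookup by ID only works if the mapping still contains actors from dead jobs, which is why the parsing approach was needed before those actors were returned.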

alanwguo avatar Apr 17 '23 16:04 alanwguo

Don't have a strong opinion here. Just for historical context, IIRC the 10k actor limit and per-job cleanup were there to avoid leaking memory.

wuisawesome avatar Apr 17 '23 22:04 wuisawesome

cc @rickyyx, do you think it would be hard to address @alanwguo's comment in this PR? It seems like an easy change.

rkooo567 avatar Apr 18 '23 00:04 rkooo567

cc @scv119 to review gcs_service proto changes

rkooo567 avatar Apr 18 '23 00:04 rkooo567

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

stale[bot] avatar Jun 10 '23 04:06 stale[bot]

TODO

rickyyx avatar Jun 12 '23 01:06 rickyyx

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

stale[bot] avatar Oct 15 '23 11:10 stale[bot]