
[core] Deflake test_autoscaler

Open rickyyx opened this issue 1 year ago • 6 comments

Why are these changes needed?

The current autoscaler can hit race conditions between its components, resulting in excessive upscaling or downscaling.

Instead of narrowing down the nasty race conditions, we are adding extensive sleeps to deflake the test. This is also because we will be rolling out the V2 autoscaler, which should make these failures irrelevant.
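
As a rough illustration of the sleep-based workaround (a minimal sketch, not the code in this PR): `FakeAutoscaler`, `scale_then_check`, and all timings below are hypothetical stand-ins, and nothing here uses Ray's actual autoscaler API. The fake only reports the requested node count after a settle delay, so a check made too early fails while a padded check passes.

```python
import time


class FakeAutoscaler:
    """Hypothetical stand-in (not Ray's autoscaler).

    A requested node count only becomes visible after `settle_seconds`,
    which mimics state that lags behind a scaling request.
    """

    def __init__(self, settle_seconds: float) -> None:
        self._settle_seconds = settle_seconds
        self._requested_at = None
        self._target = 0
        self._current = 0

    def request_nodes(self, count: int) -> None:
        self._target = count
        self._requested_at = time.monotonic()

    def num_nodes(self) -> int:
        # The new count is reported only once the settle delay has elapsed.
        if self._requested_at is not None:
            elapsed = time.monotonic() - self._requested_at
            if elapsed >= self._settle_seconds:
                self._current = self._target
        return self._current


def scale_then_check(autoscaler: FakeAutoscaler, count: int, sleep_seconds: float) -> bool:
    """Request `count` nodes, wait a fixed time, then check the result."""
    autoscaler.request_nodes(count)
    time.sleep(sleep_seconds)  # the artificial sleep that hides the lag
    return autoscaler.num_nodes() == count
```

The trade-off is the usual one for fixed sleeps: the test gets slower, and it only stays green as long as the sleep is longer than the worst-case lag.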

Related issue number

Checks

  • [ ] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • [ ] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    • [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [ ] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

rickyyx avatar Jan 24 '24 08:01 rickyyx

Ok - it's passing now with extensive sleeps.

https://buildkite.com/ray-project/premerge/builds/17367#018d3e71-376a-438a-85d2-fe81600c59b5

rickyyx avatar Jan 25 '24 05:01 rickyyx

@can-anyscale - I thought there was a PR to add logs to the Windows CI run? Any update on that?

rickyyx avatar Jan 31 '24 06:01 rickyyx

Q: How should we track removing this artificial sleep after releasing v2? Or is this test no longer relevant after releasing v2?

Many of the tests will be refactored. I will be porting the existing v1 tests to v2.

rickyyx avatar Jan 31 '24 06:01 rickyyx

@rickyyx I just merged it a bit earlier

can-anyscale avatar Jan 31 '24 08:01 can-anyscale

@can-anyscale thanks!

However, it looks like the artifacts don't contain the test printout.

Do you think it would be possible to add artifacts like these as well? C:/raytmp/6w7zvoxv/execroot/com_github_ray_project_ray/bazel-out/x64_windows-opt/testlogs/python/ray/tests/test_autoscaler/test_attempts/attempt_2.log
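
For illustration only, a collection step along these lines could be sketched as below. This is an assumption about how one might do it, not how the CI is actually wired: `collect_attempt_logs`, `testlogs_root`, and `artifacts_dir` are hypothetical names, and the only thing taken from the path above is the `test_attempts/attempt_*.log` layout.

```python
import shutil
from pathlib import Path


def collect_attempt_logs(testlogs_root: str, artifacts_dir: str) -> list[Path]:
    """Copy every test_attempts/attempt_*.log under `testlogs_root`
    into `artifacts_dir`, keeping the path relative to the root.

    Hypothetical helper; both directory arguments are assumptions.
    """
    root = Path(testlogs_root)
    out = Path(artifacts_dir)
    copied = []
    # sorted() materializes the matches before any files are copied.
    for log in sorted(root.glob("**/test_attempts/attempt_*.log")):
        dest = out / log.relative_to(root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(log, dest)
        copied.append(dest)
    return copied
```

Keeping the relative path means logs from different tests with the same attempt number do not overwrite each other in the artifacts directory.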

rickyyx avatar Jan 31 '24 18:01 rickyyx

@rickyyx i'll look into it ;)

can-anyscale avatar Jan 31 '24 23:01 can-anyscale