ray
[core] Deflaky test_autoscaler
Why are these changes needed?
The current autoscaler can have race conditions between its components, which result in excessive upscaling/downscaling.
Instead of tracking down these race conditions, we are adding generous sleeps to deflake the test. This is also because we will be rolling out the V2 autoscaler, which should make these failures no longer relevant.
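For illustration only, here is a minimal sketch of the kind of race a fixed sleep papers over. `FakeScaler` and `check_with_fixed_sleep` are hypothetical names invented for this example; they are not code from this PR or from Ray, and the delays are arbitrary.

```python
import threading
import time


class FakeScaler:
    """Hypothetical stand-in for a component that converges asynchronously."""

    def __init__(self, delay_s: float) -> None:
        self._delay_s = delay_s
        self._lock = threading.Lock()
        self._num_nodes = 0

    def request_nodes(self, target: int) -> None:
        # Apply the change on a background thread after a delay, so a caller
        # that checks immediately can still observe the old state.
        def apply() -> None:
            time.sleep(self._delay_s)
            with self._lock:
                self._num_nodes = target

        threading.Thread(target=apply, daemon=True).start()

    def num_nodes(self) -> int:
        with self._lock:
            return self._num_nodes


def check_with_fixed_sleep(scaler: FakeScaler, target: int, settle_s: float) -> bool:
    """Request a change, sleep for a fixed time, then check the state once."""
    scaler.request_nodes(target)
    time.sleep(settle_s)
    return scaler.num_nodes() == target


if __name__ == "__main__":
    # Sleep is shorter than the convergence delay: the check sees stale state.
    print(check_with_fixed_sleep(FakeScaler(delay_s=0.5), 3, settle_s=0.01))
    # Sleep is comfortably longer than the delay: the check passes.
    print(check_with_fixed_sleep(FakeScaler(delay_s=0.05), 3, settle_s=0.5))
```

The trade-off is that the test only passes as long as the sleep stays longer than the slowest convergence, and it pays that wait on every run.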
Related issue number
Checks
- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
Ok - it's passing now with extensive sleeps.
https://buildkite.com/ray-project/premerge/builds/17367#018d3e71-376a-438a-85d2-fe81600c59b5
@can-anyscale - I thought there was a PR to add logs to the Windows CI run? Any update on that?
Q: How should we track removing this artificial sleep once v2 is released? Or is this test no longer relevant after v2 is released?
Many of the tests would be refactored. I will be porting existing v1 tests to v2.
@rickyyx I just merged it a bit earlier
@can-anyscale thanks!
However, it looks like the artifacts don't contain the test printout.
Do you think it would be possible to add artifacts like these as well? C:/raytmp/6w7zvoxv/execroot/com_github_ray_project_ray/bazel-out/x64_windows-opt/testlogs/python/ray/tests/test_autoscaler/test_attempts/attempt_2.log
@rickyyx i'll look into it ;)