kuberay
kuberay copied to clipboard
[Test][Autoscaler] deflaky unexpected dead actors in tests by more resources
Why are these changes needed?
Use more resources to deflaky. This can pass the flaky test 200 times without failures on my mac.
Related issue number
Closes #3701
Checks
- [x] I've made sure the tests are passing.
- Testing Strategy
- [ ] Unit tests
- [x] Manual tests
- [ ] This PR is not tested :(
Why did you conclude that the flakiness is due to resource issues? The resource configuration is unexpectedly high compared to the configuration before #3707.
This makes me feel these PRs are hot fix instead of fixing the real root causes.
Why did you conclude that the flakiness is due to resource issues? The resource configuration is unexpectedly high compared to the configuration before #3707.
This makes me feel these PRs are hot fix instead of fixing the real root causes.
https://github.com/ray-project/kuberay/pull/3707 has already resolved the unexpected actor exit issue on my MacBook. However, the problem still occurs on the Buildkite CI runners. I’m wondering if this could be due to the runners having lower performance. What we could try now is increasing the resource requirement once again for the CI runners.
Yes, you are right. If this still doesn't fix the flakiness on the CI runners, we will need to find another way.