Kai Fricke
I think the only remaining issues are #31182 and #31293.

1. Why is there no test failing for this?
2. Can we prioritize fixing that?
The usual workflow is to start a separate Ray cluster for each tuning job, because two concurrently running Tune runs will compete for resources, which...
It's also been discussed on Slack, e.g.

- https://ray-distributed.slack.com/archives/CNECXMW22/p1673356718620379
- https://ray-distributed.slack.com/archives/CNECXMW22/p1641502218000300

so we may want to see if we can better support this in the future. To be clear, the...
Related issue with a workaround: https://github.com/ray-project/ray/issues/30091 We'll aim to include a fix in Ray 2.4.
> @krfricke tests has passed, but when I try to add a `ci/env/install_java.sh`, tests failed due to permission denied. How to solve it? Did you enable the execution flag with...
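The usual fix is to set the executable bit and, importantly, record it in git so CI checkouts keep it. A sketch of both steps, demonstrated on a throwaway repo (in the PR you would run the two marked commands on `ci/env/install_java.sh` instead):

```shell
set -e
# Throwaway repo standing in for the PR checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf '#!/bin/sh\necho java installed\n' > install_java.sh
git add install_java.sh

chmod +x install_java.sh                      # set the bit locally
git update-index --chmod=+x install_java.sh   # record it in git's index

# The staged file mode should now be 100755 (executable).
git ls-files -s install_java.sh
```

Committing only the `chmod` without the `git update-index` step is a common pitfall: the file stays mode 100644 in git, and CI hits "permission denied" again on a fresh checkout.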
Also, you are building on a pretty old base master branch. Could you please merge the latest upstream master into this PR?
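Updating a PR branch against the latest upstream master can be sketched as follows; the remote name `upstream` is a convention (not something the PR defines), and a throwaway repo stands in for `ray-project/ray` so the example is self-contained:

```shell
set -e
work=$(mktemp -d)

# Stand-in upstream repo with one commit on master.
git init -q -b master "$work/upstream"
git -C "$work/upstream" -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial"

# Your fork/clone with a PR branch based on the old master.
git clone -q "$work/upstream" "$work/fork"
git -C "$work/fork" switch -q -c my-pr-branch

# Meanwhile, upstream master moves ahead.
git -C "$work/upstream" -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "newer master"

cd "$work/fork"
git remote add upstream "$work/upstream"   # on a real fork: the ray-project/ray URL
git fetch -q upstream                      # fetch the latest upstream master
git -c user.email=a@b -c user.name=a \
    merge -q upstream/master               # merge it into the PR branch

# The PR branch now contains the newer upstream commit.
git log --oneline
```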
We are now building wheels for every commit with #31522. We may cherry-pick this onto the latest release (2.2.0); otherwise it will definitely be included in the next release.
I'll look into it - generally it should work as 2.1 also uses the new CI system.
- `rayproject/ray:latest` always points to the latest release
- `rayproject/ray:nightly` points to the latest _master_ build
- E.g. `rayproject/ray:2.0.0` points to the Ray 2.0.0 release
- You can see the...
It's very interesting to see the difference between failing and non-failing runs. In failing runs, we often see very long times before training actually starts. E.g. [here](https://console.anyscale-staging.com/o/anyscale-internal/jobs/prodjob_iq5y3e4b4eputy95ymyay164s4) (passing):

```
Training...
```