Release test xgboost_train_gpu.aws failed

Open can-anyscale opened this issue 1 year ago • 5 comments

Release test xgboost_train_gpu.aws failed. See https://buildkite.com/ray-project/release-tests-branch/builds/1774#0188a38a-9182-401e-b2c6-3e9d56403e1a for more details. cc @ml

 -- created by ray-test-bot

can-anyscale avatar Jun 10 '23 04:06 can-anyscale

Likely also broken by blamed commit: https://github.com/ray-project/ray/commit/30e6b292d87013b290f37399c5b04ffaaff583fb, cc: @scv119

can-anyscale avatar Jun 12 '23 16:06 can-anyscale

The blamed commit removed this pin for py37. Any reason for this @scv119?

modin==0.12.1; python_version < '3.8'

The cluster env for the release test can't build because it defaults to py37, and pip finds no modin 0.18.1 for that Python version:

[ERROR] 6/9/2023, 9:35:26 PM: ERROR: Could not find a version that satisfies the requirement modin==0.18.1 (from versions: 0.0.2, 0.1.0, 0.1.1, 0.1.2, 0.2.0, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.3.0rc1, 0.3.0, 0.3.1, 0.4.0rc1, 0.4.0, 0.5.0rc1, 0.5.0, 0.5.1, 0.5.2, 0.5.3, 0.5.4, 0.6.0, 0.6.1, 0.6.2, 0.6.3, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.8.0, 0.8.1, 0.8.1.1, 0.8.2, 0.8.3, 0.8.3.post0, 0.8.3.post1, 0.8.3.post2, 0.8.3.post3, 0.8.3.post4, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.10.2, 0.11.0, 0.11.1, 0.11.2, 0.11.3, 0.12.0, 0.12.1, 0.16.0, 0.16.1, 0.16.2, 0.17.0, 0.17.1)

Proposed fix: With py38 being the standard soon, we can switch this release test to py38 to fix the test for now.
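
To make the failure mode concrete, here is a minimal sketch of what the removed pin did. The helper names are hypothetical, the version list is only the tail of the list pip printed in the log above, and treating 0.18.1 as the unconditional pin is an assumption based on the requirement shown in that log.

```python
# Tail of the modin versions pip reported as available in the py37 cluster env
# (copied from the error log above; earlier versions omitted for brevity).
PY37_AVAILABLE_MODIN = ["0.12.0", "0.12.1", "0.16.0", "0.16.1", "0.16.2", "0.17.0", "0.17.1"]


def modin_pin(python_version):
    """Hypothetical helper mirroring the marker `python_version < '3.8'`.

    Assumption: 0.18.1 is the pin that applies when the marker does not match.
    """
    if python_version < (3, 8):
        return "0.12.1"
    return "0.18.1"


def is_installable(version, available):
    """Return True if the pinned version is among those pip can see."""
    return version in available


# With the py37 pin in place, the requested version exists for Python 3.7.
with_pin = is_installable(modin_pin((3, 7)), PY37_AVAILABLE_MODIN)

# Without the pin, py37 falls through to 0.18.1, which pip cannot find.
without_pin = is_installable("0.18.1", PY37_AVAILABLE_MODIN)
```

Under these assumptions, `with_pin` is true and `without_pin` is false, which matches the build error: either the py37 pin comes back or the test moves to py38.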

justinvyu avatar Jun 13 '23 07:06 justinvyu

@justinvyu using py38 sounds good to me!

can-anyscale avatar Jun 13 '23 23:06 can-anyscale

Got it, I'll put up a PR.

By the way, air_benchmark_xgboost_cpu_10.aws is also failing due to this same issue. Will be fixed as well. See below:

https://buildkite.com/ray-project/release-tests-branch/builds/1794#0188b3df-2418-42f7-be88-7be187494ed9

justinvyu avatar Jun 14 '23 01:06 justinvyu

Test has been failing for far too long. Jailing.

can-anyscale avatar Jun 14 '23 14:06 can-anyscale

Re-opening issue as test is still failing. Latest run: https://buildkite.com/ray-project/release-tests-branch/builds/1810#0188d3fb-7201-4cf0-a3a1-1c0f15a303b5

can-anyscale avatar Jun 19 '23 14:06 can-anyscale

Last failure was bogus.

can-anyscale avatar Jun 21 '23 00:06 can-anyscale

Test passed on latest run: https://buildkite.com/ray-project/release-tests-branch/builds/1815#0188dcd4-faab-4f5b-a739-0d21b912847f

can-anyscale avatar Jun 21 '23 14:06 can-anyscale