Matti Picus
right. The question is why? What could be sending a stop signal? I scanned the logs, all I could find was `[2022-08-11 19:24:20,138 E 14084 13392] (raylet.exe) agent_manager.cc:107: The raylet...
@stephano41 when the process is halted, what is the last status report? For me it halts after about 1hr7min ``` == Status == Current time: 2022-08-11 19:22:53 (running for 01:07:37.02)...
I see an unusual message from the `runtime_env` module at the end of the `dashboard_agent.log` at the time the process is shutting down. It seems the `TUNE_ORIG_WORKING_DIR` is used in...
In the first script, after setting `gpus_per_trial=0` it runs all 250 trials without failing in 5hr40min. I wonder if there is a problem with the torchvision GPU code causing a...
Updating pytorch to 1.12 did not solve the problem ``` conda install -c pytorch pytorch==1.12 torchvision=0.13 REM fix the "paging file too small problem" del d:\miniconda\envs\issue27646\lib\site-packages\torch\lib\*.dll_bak python d:\temp\fixNvPe.py --input d:\miniconda\envs\issue27646\lib\site-packages\torch\lib\*.dll...
Not connected to this issue, but note how I installed ray [above](https://github.com/ray-project/ray/issues/27646#issuecomment-1212004336). You must use pip; there is no ray conda package yet.
Perhaps my use of "must" is a bit extreme: there are [conda packages](https://anaconda.org/conda-forge/ray-core), just none yet for python3.10+ or ray 1.13+.
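As a rough illustration of the rule of thumb in the comment above, here is a minimal sketch that decides whether pip is the only option. The cut-off versions come straight from that comment and were only true when it was written; the names `needs_pip` and `major_minor` are hypothetical helpers, not part of ray or conda.

```python
import sys

# Assumption: cut-offs taken from the comment above, valid only at the time
# it was written. conda-forge had no ray builds at or beyond these versions.
FIRST_PYTHON_WITHOUT_CONDA = (3, 10)
FIRST_RAY_WITHOUT_CONDA = (1, 13)


def major_minor(version_text):
    """Return (major, minor) from a string such as '1.13.0'.

    Hypothetical helper: expects at least 'major.minor' with numeric parts.
    """
    major, minor = version_text.split(".")[:2]
    return int(major), int(minor)


def needs_pip(ray_version, python_version=None):
    """True if, per the comment above, no conda package would be available."""
    if python_version is None:
        # Default to the interpreter running this script.
        python_version = sys.version_info[:2]
    return (
        tuple(python_version) >= FIRST_PYTHON_WITHOUT_CONDA
        or major_minor(ray_version) >= FIRST_RAY_WITHOUT_CONDA
    )


if __name__ == "__main__":
    print(needs_pip("1.13.0"))
```

The check uses tuple comparison so that, for example, Python 3.10 sorts after 3.9, which a plain string comparison would get wrong.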
~All CI is passing, it seems readthedocs is stuck.~
Windows CI is failing:
- `test_unpickleable_stacktrace` is failing with `RuntimeError: Failed to unpickle serialized exception`
- Some tests are timing out: `//python/ray/serve:test_long_poll`, `//python/ray/serve:tutorial_rllib`, `//python/ray/tests:test_basic_2`, `//python/ray/tests:test_multiprocessing`, `//python/ray/tests:test_iter`, `//python/ray/tests:test_asyncio`, `//python/ray/tests:test_advanced_4`, `//python/ray/tests:test_component_failures`
Documentation-only PR #27529 is also suffering from timeouts on Windows.