[core] ray.init raises Error when giving _temp_dir argument

Open Crispy13 opened this issue 2 years ago • 5 comments

What happened + What you expected to happen

Environment: Jupyter Server, Python 3.10.6.

ray.init(num_cpus=32)

works, but

ray.init(num_cpus=32, _temp_dir=".tmp/ray")

raised:

Traceback (most recent call last):
  File "python3.10/site-packages/ray/_private/node.py", line 312, in __init__
    ray._private.services.wait_for_node(
  File "python3.10/site-packages/ray/_private/services.py", line 385, in wait_for_node
    raise TimeoutError("Timed out while waiting for node to startup.")
TimeoutError: Timed out while waiting for node to startup.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python3.10/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_1324842/4179850511.py", line 2, in <cell line: 2>
    ray.init(num_cpus=_UG['cores'], _temp_dir =".tmp/ray")
  File "python3.10/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "python3.10/site-packages/ray/_private/worker.py", line 1429, in init
    _global_node = ray._private.node.Node(
  File "python3.10/site-packages/ray/_private/node.py", line 319, in __init__
    raise Exception(
Exception: The current node has not been updated within 30 seconds, this could happen because of some of the Ray processes failed to startup.

Versions / Dependencies

CentOS, Python 3.10.6, Ray 2.1.0, Jupyter Server

Reproduction script

import ray

ray.init(num_cpus=32, _temp_dir=".tmp")

Issue Severity

No response

Crispy13, Nov 25 '22 04:11

I could reproduce the issue.

rkooo567, Nov 25 '22 07:11

Hi, could I be assigned to resolve this? I expect to finish within the next two weeks at most. Thank you.

asaacke, Nov 30 '22 16:11

I got this error on Ray 1.13 too. I tried setting the temp dir both with init(_temp_dir="xxx") and with the RAY_TMPDIR environment variable, and got the same error either way.

LinWencong, Dec 12 '22 09:12

Hi, may I know if this has been solved? I also hit this issue in an M1 Mac environment.

galvaniquee, Mar 28 '23 15:03

Can reproduce. It happens on a local laptop, and num_cpus is irrelevant: the error occurs whenever _temp_dir is a relative path. The path is created, but the session_latest symlink is not set up correctly.

I will make a fix. The workaround: don't use a relative path. Try "/tmp/ray" or os.path.abspath(".tmp/ray").

rynewang, Jun 14 '23 21:06
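
The workaround from the last comment can be sketched as below. This is only an illustration, not the fix that was promised for Ray itself: the helper names absolute_temp_dir and init_ray are hypothetical and not part of Ray. The only Ray call used is ray.init with the num_cpus and _temp_dir arguments that appear in the original report.

```python
import os


def absolute_temp_dir(path: str) -> str:
    """Return `path` as an absolute path, resolved against the current working directory.

    Hypothetical helper for illustration; not part of Ray.
    """
    return os.path.abspath(os.path.expanduser(path))


def init_ray(temp_dir: str, **kwargs):
    """Call ray.init with an absolute _temp_dir (hypothetical wrapper).

    Ray is imported lazily so that absolute_temp_dir can be used and
    tested on a machine where Ray is not installed.
    """
    import ray

    return ray.init(_temp_dir=absolute_temp_dir(temp_dir), **kwargs)
```

With this wrapper, the failing call from the report would become init_ray(".tmp/ray", num_cpus=32). Ray then receives an absolute path, which is the case reported to work in this thread.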