
[core] handle unserializable user exception

Open hongchaodeng opened this issue 10 months ago • 1 comments

If I run the following script, the core worker fails to serialize the RayError and crashes.

(Credit to @alexeykudinkin.)

from time import sleep

import ray

ray.init()

@ray.remote(num_cpus=1)
class Callee:

    def bar(self):
        from tenacity import retry, stop_after_attempt

        @retry(stop=stop_after_attempt(1))
        def failing_method():
            raise ValueError("failed")

        failing_method()


@ray.remote(num_cpus=1)
class Caller:

    def __init__(self, h):
        self.callee = h

    def foo(self):
        ref = self.callee.bar.remote()
        ray.wait([ref])


callee = Callee.remote()
caller = Caller.remote(callee)

ray.wait([caller.foo.remote()])

sleep(1800)

This PR fixes that crash.
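For readers unfamiliar with this failure mode, here is a minimal sketch of one way to guard against an unserializable user exception. It uses plain `pickle` rather than Ray, and it is not necessarily what this PR implements. `UnpicklableError` and `serialize_exception` are hypothetical names invented for illustration, and the choice of `RuntimeError` as the fallback type is an assumption.

```python
import pickle
import threading


class UnpicklableError(Exception):
    """Hypothetical user exception that carries an unpicklable attribute."""

    def __init__(self, message):
        super().__init__(message)
        # Lock objects cannot be pickled, so pickling this exception fails.
        self.lock = threading.Lock()


def serialize_exception(exc):
    """Pickle exc, falling back to a plain RuntimeError if that fails.

    Hypothetical helper: the fallback keeps only the exception's type name
    and message, so the receiver still sees why the task failed.
    """
    try:
        return pickle.dumps(exc)
    except Exception as err:
        fallback = RuntimeError(
            f"{type(exc).__name__}: {exc} "
            f"(original exception could not be serialized: {err})"
        )
        return pickle.dumps(fallback)


# An unpicklable exception comes back as the RuntimeError fallback.
restored = pickle.loads(serialize_exception(UnpicklableError("failed")))

# A picklable exception round-trips unchanged.
passthrough = pickle.loads(serialize_exception(ValueError("failed")))
```

The point of the sketch is that the serialization failure is caught where the exception is serialized and replaced with something that can always be serialized, so the process reporting the error does not crash.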

hongchaodeng, Apr 19 '24 22:04

@jjyao Are there any existing tests that cover exceptions?

hongchaodeng, Apr 30 '24 15:04

@jjyao Test and docs added. PTAL!

hongchaodeng, Apr 30 '24 21:04

@jjyao Ready for review. PTAL

hongchaodeng, May 01 '24 18:05