Zhi Lin
Hi @stephanie-wang, I realized that it's possible to return Spark RDD data from executor actors through Ray calls, because RDD and Partition are serializable. I have implemented this, but...
Hi @jjyao, I'm having trouble getting this to work in a cluster right now, so I have not added the tests yet. But the effect we want is something...
It seems to be able to solve our problem. But it'll need external HA storage, if I understand correctly? How is it different from saving our Spark results to...
I see. This PR aims to apply Ray's and Spark's lineage-based recovery to handle object loss. I think there are no conflicts between this PR and the Object HA PR....
@scv119 Yes, as long as the actor's task can be executed by another actor and yields the same output. I'll add a unit test soon.
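To make that condition concrete, here is a minimal sketch in plain Python. It does not use the Ray or RayDP APIs; `Worker`, `compute_partition`, and `run_with_failover` are hypothetical names made up for illustration. The only point is that a deterministic task re-executed on a different worker gives back the same result, which is the property the recovery relies on.

```python
from typing import Callable, List


class Worker:
    """Hypothetical stand-in for an executor actor (not a Ray or RayDP class)."""

    def __init__(self, name: str, alive: bool = True) -> None:
        self.name = name
        self.alive = alive

    def run(self, task: Callable[[], List[int]]) -> List[int]:
        # A dead worker cannot run anything; the caller has to pick another one.
        if not self.alive:
            raise RuntimeError(f"{self.name} is dead")
        return task()


def compute_partition() -> List[int]:
    # Deterministic: the result depends only on the task itself,
    # so any worker that runs it produces the same output.
    return [x * x for x in range(5)]


def run_with_failover(workers: List[Worker],
                      task: Callable[[], List[int]]) -> List[int]:
    # Try each worker in order and re-execute the task on the next one
    # whenever the current worker has been lost.
    for worker in workers:
        try:
            return worker.run(task)
        except RuntimeError:
            continue
    raise RuntimeError("no live worker could run the task")


# Output produced before any failure.
original = Worker("executor-0").run(compute_partition)

# executor-0 is lost, so the task is re-executed on executor-1.
recovered = run_with_failover(
    [Worker("executor-0", alive=False), Worker("executor-1")],
    compute_partition,
)
assert recovered == original
```

If `compute_partition` depended on state held only by the lost worker, the final assertion could fail, and that is the case this approach does not cover.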
@jjyao, as you said, this PR is not that generic. I'm now trying to see if the executor can be restarted in RayDP. I was thinking this should be simpler...
Hi @YeahNew, the latest stable version of RayDP supports Ray 1.3.0. Ray 1.4.0 changes the signature, which is why you see this message. Have you tried RayDP 0.3.0 with Ray...
What's the error message?
This is weird. I just tried it and it works. Just to make sure, did you restart the Ray cluster after installing Ray 1.3.0? It complains about a MalformedJsonException; is there any illegal...
I intended to have a branch that is compatible with ray-nightly. But if there is no need, we may just focus on 1.3.0.