xorbits
BUG: cuda_devices does not take effect when initializing another session that connects to an existing cluster
Note that the issue tracker is NOT the place for general support. For discussions about development, questions about usage, or any general questions, contact us on https://discuss.xorbits.io/.
Reproduce: Currently, using multiple GPUs leads to a deadlock.
- Init a local cluster
xorbits.init()
Get the endpoint from the console log output.
- Init another session in another console process
xorbits.init(endpoint=<endpoint above>, cuda_devices=[0])
Then submit a task to this newly initialized session; a deadlock occurs.
Therefore, the cuda_devices=[0] setting does not take effect.
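
For convenience, the second-console step could be sketched as a small script like the one below. This is only a rough sketch: the helper names `second_session_kwargs` and `connect_and_submit` are hypothetical, the keyword names `endpoint` and `cuda_devices` are copied from the calls in this report and have not been checked against the xorbits API, and the `xorbits.numpy` computation is just an arbitrary small task chosen for illustration.

```python
def second_session_kwargs(endpoint, cuda_devices):
    """Build the keyword arguments for the second xorbits.init() call.

    The keyword names are taken verbatim from this report; they are an
    assumption, not a verified description of the xorbits API.
    """
    return {"endpoint": endpoint, "cuda_devices": list(cuda_devices)}


def connect_and_submit(endpoint):
    """Connect to the existing cluster and submit one small task."""
    # Imported lazily so this file can be loaded without xorbits installed.
    import xorbits
    import xorbits.numpy as xnp

    xorbits.init(**second_session_kwargs(endpoint, [0]))
    # Printing forces the lazy computation to run; per the report,
    # the deadlock shows up once a task is submitted.
    print(xnp.ones((10, 10)).sum())
```

To try it, call `connect_and_submit("<endpoint above>")` in the second console, substituting the endpoint printed by the first one.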