xorbits
BUG: Should catch the decref error when the session has already been destroyed
Reproduce:
- Init a cluster
import xorbits
xorbits.init(cuda_devices=[0])
- Run a task in another session, then destroy the session explicitly
import xorbits
xorbits.init('<endpoint above>')
import xorbits.numpy as np
np.random.rand(10000, 10000).to_gpu()
xorbits.shutdown()
- Exit the process in which you ran the task above. The cluster (tornado) will then raise an error:
2023-04-21 07:58:04,500 xorbits._mars.services.web.core 111327 ERROR ActorNotExist when handling request with LifecycleWebAPIHandler.decref_tileables
Traceback (most recent call last):
  File "/home/lichengjie/workspace/xorbits/python/xorbits/_mars/services/web/core.py", line 69, in wrapped
    res = await func(self, *args, **kwargs)
  File "/home/lichengjie/workspace/xorbits/python/xorbits/_mars/services/lifecycle/api/web.py", line 39, in decref_tileables
    await oscar_api.decref_tileables(tileable_keys, counts=counts)
  File "/home/lichengjie/workspace/xorbits/python/xorbits/_mars/services/lifecycle/api/oscar.py", line 108, in decref_tileables
    return await self._lifecycle_tracker_ref.decref_tileables(tileable_keys)
  File "xoscar/core.pyx", line 251, in xoscar.core.LocalActorRef.__getattr__
xoscar.errors.ActorNotExist: Actor b'2IC9l6dhZaChiD31uVp7EYKq_lifecycle_tracker' does not exist
2023-04-21 07:58:04,500 tornado.access 111327 ERROR 500 POST /api/session/2IC9l6dhZaChiD31uVp7EYKq/lifecycle?action=decref_tileables (127.0.0.1) 1.18ms
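This error should be caught instead of being propagated: the session was destroyed on purpose, so logging an ERROR with a traceback and returning a 500 response is misleading.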
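One possible shape for the fix, as a minimal sketch and not the actual xorbits code: wrap the decref call and treat a missing lifecycle tracker as "nothing left to release". The `ActorNotExist` class below is a local stand-in for `xoscar.errors.ActorNotExist`, and `DestroyedTrackerRef` and `safe_decref_tileables` are hypothetical names used only for illustration. The stub raises on attribute access, which is where the traceback above shows the error originating.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class ActorNotExist(Exception):
    """Local stand-in for xoscar.errors.ActorNotExist (assumption for this sketch)."""


class DestroyedTrackerRef:
    """Hypothetical stub for a ref whose actor was removed with its session."""

    def __getattr__(self, name):
        # Mirrors the traceback: the error surfaces on attribute access.
        raise ActorNotExist("Actor b'<session_id>_lifecycle_tracker' does not exist")


class LiveTrackerRef:
    """Hypothetical stub for a ref whose actor is still alive."""

    def __init__(self):
        self.released = []

    async def decref_tileables(self, tileable_keys):
        self.released.extend(tileable_keys)


async def safe_decref_tileables(tracker_ref, tileable_keys):
    """Decref tileables; return False instead of raising if the tracker is gone."""
    try:
        await tracker_ref.decref_tileables(tileable_keys)
    except ActorNotExist:
        # The session is already destroyed, so its tileables are already released.
        logger.debug("Skip decref of %s: session already destroyed", tileable_keys)
        return False
    return True


if __name__ == "__main__":
    print(asyncio.run(safe_decref_tileables(DestroyedTrackerRef(), ["key-1"])))
```

Whether the exception is best swallowed in the web handler, in the oscar API layer, or on the client side at process exit is a design decision for the maintainers. The sketch only shows that the failure can be downgraded from an ERROR to a debug message.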