DeepSpeed-MII
What is the recommended way to wrap the DeepSpeed-MII mii.client in a service?
We tried to wrap mii.client in a FastAPI service, but it throws the error below when client.generate is called. I suspect this is because mii.client already uses its own event loop.
We want this wrapper for flexibility. For example, it would let us add control logic to reject requests when the load is above a certain threshold, report the current load, etc.
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/opt/conda/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
return await self.app(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/fastapi/applications.py", line 289, in __call__
await super().__call__(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
raise exc
File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "/opt/conda/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
raise e
File "/opt/conda/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
await self.app(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
await route.handle(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
await self.app(scope, receive, send)
File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
response = await func(request)
File "/opt/conda/lib/python3.10/site-packages/fastapi/routing.py", line 273, in app
raw_response = await run_endpoint_function(
File "/opt/conda/lib/python3.10/site-packages/fastapi/routing.py", line 190, in run_endpoint_function
return await dependant.call(**values)
File "/tmp/deepspeed-mii/relax/deepspeed_mii/serving/api.py", line 11, in serve
model_responses = handler.client.generate(**(req.model_dump()))
File "/opt/conda/lib/python3.10/site-packages/mii/backend/client.py", line 74, in generate
return self.asyncio_loop.run_until_complete(
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 625, in run_until_complete
self._check_running()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 586, in _check_running
raise RuntimeError(
RuntimeError: Cannot run the event loop while another loop is running
We are aware that the client does not work well with FastAPI. We are currently investigating and will share an update when we can. Thanks!
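In the meantime, one general pattern that may sidestep this class of error is to keep the blocking call off the server's event loop by running it in a worker thread. The sketch below is not MII code and has not been checked against MII itself: `BlockingClient` is a hypothetical stand-in that, like the traceback above, calls `run_until_complete` on its own loop inside `generate`. `MAX_IN_FLIGHT`, `serve_direct`, and `serve_in_thread` are likewise illustrative names. It assumes Python 3.9+ for `asyncio.to_thread`, and it includes a simple in-flight counter as one possible way to reject requests under load.

```python
import asyncio
import threading


class BlockingClient:
    """Hypothetical stand-in for a client whose generate() drives its own loop."""

    def __init__(self):
        # Private loop that only runs while generate() is executing.
        self._loop = asyncio.new_event_loop()

    async def _request(self, prompt):
        await asyncio.sleep(0.01)  # pretend to wait on the model server
        return f"echo: {prompt}"

    def generate(self, prompt):
        # Raises RuntimeError when the calling thread already runs a loop.
        return self._loop.run_until_complete(self._request(prompt))


client = BlockingClient()
MAX_IN_FLIGHT = 2  # hypothetical load threshold
in_flight = 0      # only touched from the server's event loop thread
_client_lock = threading.Lock()


def _locked_generate(prompt):
    # The stand-in's single loop cannot run two calls at once, so serialize.
    with _client_lock:
        return client.generate(prompt)


async def serve_direct(prompt):
    # Calling the blocking client inside a coroutine reproduces the error.
    return client.generate(prompt)


async def serve_in_thread(prompt):
    # Reject early when too many requests are already being processed.
    global in_flight
    if in_flight >= MAX_IN_FLIGHT:
        return "rejected: overloaded"
    in_flight += 1
    try:
        # The worker thread has no running loop, so run_until_complete works.
        return await asyncio.to_thread(_locked_generate, prompt)
    finally:
        in_flight -= 1
```

In a FastAPI app, the same effect might be achieved by declaring the endpoint with plain `def` instead of `async def`, since FastAPI runs such endpoints in a thread pool. Whether the real client tolerates being called from several threads at once is an open question, which is why the sketch serializes calls with a lock.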