OpenLLM
bug: Error in sending post request for bentoml container service
Describe the bug
After building with OpenLLM to generate the service and runner, I ran the Docker image as follows:
Server:
$ docker run --rm --gpus all -p 3000:3000 -it mymodel-service:12345 start-runner-server --runner-name llm-mistral-runner
Starting RunnerServer from "/home/bentoml/bento" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
Starting RunnerServer from "/home/bentoml/bento" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
.......
(RayWorkerVllm pid=559) INFO 02-13 19:58:42 model_runner.py:547] Graph capturing finished in 35 secs.
Client:
Sending a request to the service returns the error below. Could you please help me get requests to the service working?
$ curl -X 'POST' 'http://localhost:3000/generate_iterator' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
"prompt": "Explain superconductors like Im five years old"}'
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 596, in __call__
await self.app(scope, otel_receive, otel_send)
File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/instruments.py", line 252, in __call__
await self.app(scope, receive, wrapped_send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
response = await func(request)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/runner_app.py", line 295, in _request_handler
arg_num = int(request.headers["args-number"])
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/datastructures.py", line 565, in __getitem__
raise KeyError(key)
KeyError: 'args-number'
During handling of the above exception, another exception occurred:
+ Exception Group Traceback (most recent call last):
| File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
| await self.app(scope, receive, _send)
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/traffic.py", line 23, in __call__
| async with anyio.create_task_group():
| File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 678, in __aexit__
| raise BaseExceptionGroup(
| ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
| await self.app(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 596, in __call__
| await self.app(scope, otel_receive, otel_send)
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/instruments.py", line 252, in __call__
| await self.app(scope, receive, wrapped_send)
| File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
| await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
| raise exc
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| await app(scope, receive, sender)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
| await route.handle(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
| await self.app(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
| await wrap_app_handling_exceptions(app, request)(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
| raise exc
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| await app(scope, receive, sender)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
| response = await func(request)
| ^^^^^^^^^^^^^^^^^^^
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/runner_app.py", line 295, in _request_handler
| arg_num = int(request.headers["args-number"])
| ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
| File "/usr/local/lib/python3.11/site-packages/starlette/datastructures.py", line 565, in __getitem__
| raise KeyError(key)
| KeyError: 'args-number'
+------------------------------------
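The last frames of the traceback point at a header lookup, `int(request.headers["args-number"])`, in `runner_app.py`. The sketch below reproduces that failure mode with a plain dictionary in place of the Starlette headers object. The names `read_arg_count` and `try_read` are hypothetical and used only for illustration; this is not BentoML's code. It assumes the runner endpoint expects an `args-number` header that a plain curl request does not send.

```python
# Illustrative stand-in for the lookup shown in the traceback
# (runner_app.py, line 295). Not BentoML's actual implementation.
def read_arg_count(headers: dict[str, str]) -> int:
    # Mirrors `int(request.headers["args-number"])`:
    # indexing a missing key raises KeyError.
    return int(headers["args-number"])


def try_read(headers: dict[str, str]):
    # Hypothetical helper: return the parsed count, or the error text.
    try:
        return read_arg_count(headers)
    except KeyError as exc:
        return f"KeyError: {exc}"


# Headers sent by the curl command in this report.
curl_headers = {
    "accept": "application/json",
    "content-type": "application/json",
}

print(try_read(curl_headers))                            # KeyError: 'args-number'
print(try_read({**curl_headers, "args-number": "1"}))    # 1
```

If that assumption holds, the runner server started with `start-runner-server` is probably meant to be called by the BentoML API server rather than directly by a client, so adding the header by hand to the curl call is unlikely to be the intended fix.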
To reproduce
No response
Logs
No response
Environment
$ bentoml -v
bentoml, version 1.1.11
$ openllm -v
openllm, 0.4.45.dev2 (compiled: False)
Python (CPython) 3.11.7
System information (Optional)
No response
@aarnphm @bojiang could you please check?
Closing for OpenLLM 0.6.