[Bug] Deploying GLM4v with the official image v0.5.1 raises an error
Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
Describe the bug
(internvl) root@377ecd1ff00c:/mnt/code/yzy/InternVL# CUDA_VISIBLE_DEVICES=5 lmdeploy serve api_server /mnt/model/glm/glm4v --model-name glm4v --cache-max-entry-count 0.8
2024-07-17 08:19:57,971 - lmdeploy - WARNING - Fallback to pytorch engine because /mnt/model/glm/glm4v not supported by turbomind engine.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:07<00:00, 1.97it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
HINT: Please open http://0.0.0.0:23333 in a browser for detailed api usage!!!
HINT: Please open http://0.0.0.0:23333 in a browser for detailed api usage!!!
HINT: Please open http://0.0.0.0:23333 in a browser for detailed api usage!!!
INFO: Started server process [68826]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:23333 (Press CTRL+C to quit)
INFO: 127.0.0.1:47296 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in call
return await self.app(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in call
await super().call(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/applications.py", line 123, in call
await self.middleware_stack(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in call
raise exc
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in call
await self.app(scope, receive, _send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/cors.py", line 85, in call
await self.app(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in call
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 756, in call
await self.middleware_stack(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/lmdeploy/serve/openai/api_server.py", line 529, in chat_completions_v1
async for res in result_generator:
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/lmdeploy/serve/async_engine.py", line 571, in generate
prompt_input = await self._get_prompt_input(prompt,
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/lmdeploy/serve/vl_async_engine.py", line 59, in _get_prompt_input
segs = decorated.split(IMAGE_TOKEN)
AttributeError: 'NoneType' object has no attribute 'split'
Reproduction
CUDA_VISIBLE_DEVICES=5 lmdeploy serve api_server /mnt/model/glm/glm4v --model-name glm4v --cache-max-entry-count 0.8
Environment
Use the docker image with tag v0.5.1.
Error traceback
No response
@ZhiyuYUE hi, what is your code for calling the v1/chat/completions endpoint? You could refer to this example:
https://lmdeploy.readthedocs.io/en/latest/serving/api_server_vl.html#integrate-with-openai
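For reference, a minimal client call against the server started above might look roughly like the sketch below (assumptions: the OpenAI-compatible endpoint from the linked docs and the default port 23333 shown in the log; adjust base_url to your deployment):

```python
# Sketch only: querying the lmdeploy api_server through its OpenAI-compatible endpoint.
# base_url and the image URL are assumptions taken from this thread.
from openai import OpenAI

client = OpenAI(api_key='none', base_url='http://0.0.0.0:23333/v1')
model_name = client.models.list().data[0].id  # should resolve to the served model, e.g. glm4v

response = client.chat.completions.create(
    model=model_name,
    temperature=0,
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'Describe the image please'},
            {'type': 'image_url',
             'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}},
        ],
    }],
)
print(response.choices[0].message.content)
```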
@RunningLeon hi, I actually use the same method to post the request. Here is the command to run the docker container:

    docker run --gpus all -itd --restart=always --name glm4v --privileged=true -v /nas/litaifan/glm/glm4v:/opt/lmdeploy/glm4v --env "CUDA_VISIBLE_DEVICES=3" -p 23334:23333 --ipc=host openmmlab/lmdeploy:v0.5.1 lmdeploy serve api_server glm4v --cache-max-entry-count 0.8
And my request is:

    from openai import OpenAI
    import requests

    model_name = 'glm4v'
    print(model_name)
    headers = {'Content-Type': 'application/json'}
    data = {
        "model": model_name,  # "model": "glm4v",
        "temperature": 0,
        "messages": [{
            'role': 'user',
            'content': [{
                'type': 'text',
                'text': 'Describe the image please',
            }, {
                'type': 'image_url',
                'image_url': {
                    'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
                },
            }],
        }]
    }
    result = requests.post(url='http://192.168.10.81:23334/v1/chat/completions', headers=headers, json=data)
    response = result.text
    print(response)
Then the internal server error shows up. If I use the OpenAI client instead, I get the same error.
@ZhiyuYUE hi, the chat_template is not resolved correctly when the model path is given as glm4v. You can try one of these:
- add `--model-name glm4`: `lmdeploy serve api_server glm4v --model-name glm4 ...`
- map the model path to `glm-4v` when creating the docker container: `docker run ... -v /nas/litaifan/glm/glm4v:/opt/lmdeploy/glm-4v ... lmdeploy serve api_server glm-4v ...`
@ZhiyuYUE Have you solved this?
@RunningLeon I also encountered the same problem when running batch_infer. It looks like the glm-4v processor hasn't been adapted yet. https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/vl/templates.py#L300
Hi, how do you run batch_infer? Maybe you can give sample code to reproduce it, and post the output of lmdeploy check_env as well if possible.
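For reference, a minimal batch_infer call on a vision-language pipeline might look roughly like the sketch below (assumptions: the model path from the log above and the sample image URL from this thread; this is not the reporter's actual code):

```python
# Sketch only: batch inference through the lmdeploy pipeline API.
# The model path and image URL are taken from this thread; adjust to your setup.
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('/mnt/model/glm/glm4v')  # falls back to the PyTorch engine for glm-4v, as in the log above

image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
# For VLM pipelines, each prompt can be a (text, image) pair.
prompts = [('Describe the image please', image)]

responses = pipe.batch_infer(prompts)
print(responses[0].text)
```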
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.