[Bug] Deploying GLM4v with the official image v0.5.1 raises an error
Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
Describe the bug
(internvl) root@377ecd1ff00c:/mnt/code/yzy/InternVL# CUDA_VISIBLE_DEVICES=5 lmdeploy serve api_server /mnt/model/glm/glm4v --model-name glm4v --cache-max-entry-count 0.8
2024-07-17 08:19:57,971 - lmdeploy - WARNING - Fallback to pytorch engine because /mnt/model/glm/glm4v not supported by turbomind engine.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:07<00:00, 1.97it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
HINT: Please open http://0.0.0.0:23333 in a browser for detailed api usage!!!
HINT: Please open http://0.0.0.0:23333 in a browser for detailed api usage!!!
HINT: Please open http://0.0.0.0:23333 in a browser for detailed api usage!!!
INFO: Started server process [68826]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:23333 (Press CTRL+C to quit)
INFO: 127.0.0.1:47296 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in call
return await self.app(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in call
await super().call(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/applications.py", line 123, in call
await self.middleware_stack(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in call
raise exc
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in call
await self.app(scope, receive, _send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/cors.py", line 85, in call
await self.app(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in call
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 756, in call
await self.middleware_stack(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/lmdeploy/serve/openai/api_server.py", line 529, in chat_completions_v1
async for res in result_generator:
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/lmdeploy/serve/async_engine.py", line 571, in generate
prompt_input = await self._get_prompt_input(prompt,
File "/root/miniforge-pypy3/envs/internvl/lib/python3.9/site-packages/lmdeploy/serve/vl_async_engine.py", line 59, in _get_prompt_input
segs = decorated.split(IMAGE_TOKEN)
AttributeError: 'NoneType' object has no attribute 'split'
Reproduction
CUDA_VISIBLE_DEVICES=5 lmdeploy serve api_server /mnt/model/glm/glm4v --model-name glm4v --cache-max-entry-count 0.8
Environment
Use the docker image with tag v0.5.1.
Error traceback
No response
@ZhiyuYUE hi, what is your code for calling the v1/chat/completions endpoint? You could refer to this example:
https://lmdeploy.readthedocs.io/en/latest/serving/api_server_vl.html#integrate-with-openai
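For reference, a minimal client call against the server started above might look roughly like the sketch below (assumptions: the OpenAI-compatible endpoint from the linked docs and the default port 23333 shown in the log; adjust base_url to your deployment):

```python
# Sketch only: querying the lmdeploy api_server through its OpenAI-compatible endpoint.
# base_url and the image URL are assumptions taken from this thread.
from openai import OpenAI

client = OpenAI(api_key='none', base_url='http://0.0.0.0:23333/v1')
model_name = client.models.list().data[0].id  # should resolve to the served model, e.g. glm4v

response = client.chat.completions.create(
    model=model_name,
    temperature=0,
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'Describe the image please'},
            {'type': 'image_url',
             'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}},
        ],
    }],
)
print(response.choices[0].message.content)
```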
@RunningLeon hi, I actually use the same method to post the request. Here is the command to run the docker container:

    docker run --gpus all -itd --restart=always --name glm4v --privileged=true -v /nas/litaifan/glm/glm4v:/opt/lmdeploy/glm4v --env "CUDA_VISIBLE_DEVICES=3" -p 23334:23333 --ipc=host openmmlab/lmdeploy:v0.5.1 lmdeploy serve api_server glm4v --cache-max-entry-count 0.8
And my request is:

    from openai import OpenAI
    import requests

    model_name = 'glm4v'
    print(model_name)
    headers = {'Content-Type': 'application/json'}
    data = {
        "model": model_name,  # "model": "glm4v",
        "temperature": 0,
        "messages": [{
            'role': 'user',
            'content': [{
                'type': 'text',
                'text': 'Describe the image please',
            }, {
                'type': 'image_url',
                'image_url': {
                    'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
                },
            }],
        }]
    }
    result = requests.post(url='http://192.168.10.81:23334/v1/chat/completions', headers=headers, json=data)
    response = result.text
    print(response)
Then the internal server error shows up. If I use the OpenAI client instead, I get the same error.
@ZhiyuYUE hi, the chat_template is not resolved correctly when the model path is given as glm4v. You can try one of these:
- add `--model-name glm4`: `lmdeploy serve api_server glm4v --model-name glm4 ...`
- map the model path to `glm-4v` when creating the docker container: `docker run ... -v /nas/litaifan/glm/glm4v:/opt/lmdeploy/glm-4v ... lmdeploy serve api_server glm-4v ...`
@ZhiyuYUE Have you solved this?
@RunningLeon I also encountered the same problem when running batch_infer. It looks like the glm-4v processor hasn't been adapted yet. https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/vl/templates.py#L300
Hi, how do you run batch_infer? Maybe you can give sample code to reproduce it, and post the output of lmdeploy check_env as well if possible.
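For reference, a minimal batch_infer call on a vision-language pipeline might look roughly like the sketch below (assumptions: the model path from the log above and the sample image URL from this thread; this is not the reporter's actual code):

```python
# Sketch only: batch inference through the lmdeploy pipeline API.
# The model path and image URL are taken from this thread; adjust to your setup.
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('/mnt/model/glm/glm4v')  # falls back to the PyTorch engine for glm-4v, as in the log above

image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
# For VLM pipelines, each prompt can be a (text, image) pair.
prompts = [('Describe the image please', image)]

responses = pipe.batch_infer(prompts)
print(responses[0].text)
```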
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.