lmdeploy vlm-http-client errors
🔎 Search before asking | 提交之前请先搜索
- [x] I have searched the MinerU Readme and found no similar bug report.
- [x] I have searched the MinerU Issues and found no similar bug report.
- [x] I have searched the MinerU Discussions and found no similar bug report.
🤖 Consult the online AI assistant for assistance | 在线 AI 助手咨询
- [x] I have consulted the online AI assistant but was unable to obtain a solution to the issue.
Description of the bug | 错误描述
```
Nov 27 16:20:31 ubuntu22lts bash[450388]: INFO: 192.168.1.161:51378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Nov 27 16:20:34 ubuntu22lts bash[450388]: INFO: 192.168.1.161:60516 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Nov 27 16:20:37 ubuntu22lts bash[450388]: INFO: 192.168.1.161:51370 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Nov 27 16:20:39 ubuntu22lts bash[450388]: INFO: 192.168.1.161:44062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Nov 27 16:20:40 ubuntu22lts bash[450388]: INFO: 192.168.1.161:60516 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Nov 27 16:20:53 ubuntu22lts bash[450388]: 2025-11-27 16:20:53,027 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:20:53 ubuntu22lts bash[450388]: 2025-11-27 16:20:53,028 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:20:57 ubuntu22lts bash[450388]: 2025-11-27 16:20:57,992 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:20:57 ubuntu22lts bash[450388]: 2025-11-27 16:20:57,993 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:01 ubuntu22lts bash[450388]: 2025-11-27 16:21:01,110 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:01 ubuntu22lts bash[450388]: 2025-11-27 16:21:01,111 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:02 ubuntu22lts bash[450388]: 2025-11-27 16:21:02,475 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:02 ubuntu22lts bash[450388]: 2025-11-27 16:21:02,478 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:03 ubuntu22lts bash[450388]: INFO: 192.168.1.161:54110 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Nov 27 16:21:03 ubuntu22lts bash[450388]: 2025-11-27 16:21:03,856 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:03 ubuntu22lts bash[450388]: 2025-11-27 16:21:03,874 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:05 ubuntu22lts bash[450388]: 2025-11-27 16:21:05,483 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:05 ubuntu22lts bash[450388]: 2025-11-27 16:21:05,508 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:05 ubuntu22lts bash[450388]: 2025-11-27 16:21:05,850 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:05 ubuntu22lts bash[450388]: 2025-11-27 16:21:05,851 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:06 ubuntu22lts bash[450388]: 2025-11-27 16:21:06,044 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:06 ubuntu22lts bash[450388]: 2025-11-27 16:21:06,045 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:06 ubuntu22lts bash[450388]: 2025-11-27 16:21:06,485 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:06 ubuntu22lts bash[450388]: 2025-11-27 16:21:06,485 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:07 ubuntu22lts bash[450388]: 2025-11-27 16:21:07,105 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:07 ubuntu22lts bash[450388]: 2025-11-27 16:21:07,105 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:07 ubuntu22lts bash[450388]: 2025-11-27 16:21:07,761 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:07 ubuntu22lts bash[450388]: 2025-11-27 16:21:07,771 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:08 ubuntu22lts bash[450388]: 2025-11-27 16:21:08,329 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:08 ubuntu22lts bash[450388]: 2025-11-27 16:21:08,332 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
Nov 27 16:21:08 ubuntu22lts bash[450388]: 2025-11-27 16:21:08,507 - lmdeploy - ERROR - async_engine.py:710 - [safe_run] exception caught: GeneratorExit
Nov 27 16:21:08 ubuntu22lts bash[450388]: 2025-11-27 16:21:08,509 - lmdeploy - ERROR - async_engine.py:695 - [model_inst] exception caught:
```
How to reproduce the bug | 如何复现
Server-side deployment: 1 × RTX 2080 Ti (sm75), CUDA Version: 12.6

Launch command:

```bash
CUDA_VISIBLE_DEVICES=2 MINERU_MODEL_SOURCE=modelscope mineru-openai-server --engine lmdeploy --server-port 8331 --cache-max-entry-count 0.7 --tp 1
```
Windows-side vlm-http-client call:

```python
backend = "http-client"
middle_json, infer_result = vlm_doc_analyze(
    file_bytes,
    image_writer=image_writer,
    backend=backend,
    server_url=self.http_server_url,
)
```
Operating System Mode | 操作系统类型
Linux
Operating System Version| 操作系统版本
Server: Ubuntu 22.04; client: Windows 11
Python version | Python 版本
3.11
Software version | 软件版本 (mineru --version)
>=2.5
Backend name | 解析后端
vlm
Device mode | 设备模式
cuda
The GeneratorExit errors you are seeing from lmdeploy's async_engine.py are a known phenomenon in MinerU's VLM backends (including lmdeploy/vllm). They are usually related to GPU memory management, CUDA/PyTorch/vLLM compatibility, or resource cleanup, especially with high-end GPUs and newer CUDA versions. These errors generally do not affect the final output; as long as the inference results are normal they can be ignored, because the vLLM/lmdeploy async engines currently have no safe shutdown method and their subprocesses often exit abnormally after finishing their work [Reference 1].
If you want to reduce or avoid these errors, you can try the following:
- Upgrade MinerU to the latest stable release (≥2.5.3 recommended) and use a compatible vLLM/lmdeploy image [Reference 2].
- Lower GPU memory usage, e.g. by adjusting --cache-max-entry-count or MINERU_VIRTUAL_VRAM_SIZE.
- Tune the batch size and max_model_len to reduce per-request memory pressure.
- Manually release GPU memory after each inference (torch.cuda.empty_cache(), gc.collect()); a sketch follows this list.
- Make sure your PyTorch and CUDA versions are fully compatible, and install flash-attn if necessary.
- Set OMP_NUM_THREADS=1 to prevent multithreading deadlocks (set automatically since version 2.6.4).
- If the vLLM/lmdeploy backend remains unstable, switch to the pipeline backend, which is more stable.
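For the manual cleanup item above, a minimal sketch, assuming a PyTorch-backed engine running in the same process. Note that this only frees cached allocator blocks for tensors that are no longer referenced; it does not force a running vLLM/lmdeploy engine to shut down:

```python
import gc

import torch


def release_cached_vram() -> None:
    """Best-effort GPU memory cleanup after an inference call.

    Only memory whose tensors are no longer referenced can be freed,
    so drop your own references to the engine/results first.
    """
    gc.collect()  # break reference cycles that keep CUDA tensors alive
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return cached allocator blocks to the driver


# Illustrative usage: after vlm_doc_analyze(...) returns and you have
# dropped any references you no longer need, call release_cached_vram().
```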
As long as the output content is normal, this error generally does not affect the final result. For further diagnosis, enable MINERU_LOG_LEVEL=DEBUG to see detailed logs (see the snippet below), or check dmesg for OOM records [Reference 3].
If you find that inference results are missing or abnormal, then further check that your environment and dependency versions are consistent.
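If you want those detailed logs, one option is to set the variable in the environment of the server process before it starts. A sketch, assuming you launch the server from Python; the CLI name and flags are the ones from this thread, while the Popen wrapper itself is illustrative:

```python
import os
import subprocess

# Inherit the current environment and add verbose logging for MinerU.
env = dict(os.environ, MINERU_MODEL_SOURCE="modelscope", MINERU_LOG_LEVEL="DEBUG")

# Launch the OpenAI-compatible server with the same flags used in this report.
server = subprocess.Popen(
    [
        "mineru-openai-server",
        "--engine", "lmdeploy",
        "--server-port", "8331",
        "--cache-max-entry-count", "0.7",
        "--tp", "1",
    ],
    env=env,
)
```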
This looks like an issue in lmdeploy; you can report it to the https://github.com/InternLM/lmdeploy repository.
@myhloli @dosu
A few small questions, if I may:
- In a Python client on Windows, using the lmdeploy engine:

  ```python
  backend = "lmdeploy-engine"
  middle_json, infer_result = vlm_doc_analyze(
      file_bytes,
      image_writer=image_writer,
      backend=backend,
  )
  ```
- After inference finishes, how do I release the GPU memory? I only want to promptly free the VRAM occupied by this lmdeploy engine, without affecting other models. How can this be done?
- How can I tell whether lmdeploy is ready? The client machine has limited VRAM and several processes compete for it, so I'd like to know whether lmdeploy has finished warming up and is successfully holding its VRAM at a given moment.
- How do I adjust the concurrency of vlm_doc_analyze, i.e. the number of concurrent calls into lmdeploy?
- How do I set lmdeploy launch parameters in Python, e.g. --cache-max-entry-count?
- On Linux, which currently has higher throughput, vllm or lmdeploy? In particular across GPU architectures such as sm7x and sm8x.
You can refer to https://github.com/opendatalab/MinerU/issues/2929 and consult the AI via deepwiki to get answers.
So currently only engines like vllm/lmdeploy can be used? If I load your trained model with ollama or LM Studio, it cannot be used with backend = "vlm-http-client", right?