[Bug] When deploying InternVL2.5/3 series models with lmdeploy, the service goes down within 24 hours. What is going on?
Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
When InternVL2.5/3 series models are deployed with lmdeploy, the service goes down within 24 hours of starting. What could be causing this?
Reproduction
CUDA_VISIBLE_DEVICES=2,4 lmdeploy serve api_server /data22/ljc/proj/ckpt/InternVL2_5-38B-MPO-AWQ --server-port 2556 --cache-max-entry-count 0.5 --tp 2 > internvl2.5_38B.log 2>&1 &
Environment
Package Version
--------------------------------- -------------
accelerate 1.9.0
addict 2.4.0
aiohappyeyeballs 2.6.1
aiohttp 3.12.14
aiosignal 1.4.0
airportsdata 20250706
annotated-types 0.7.0
anyio 4.9.0
astor 0.8.1
attrs 25.3.0
av 15.0.0
beautifulsoup4 4.13.4
blake3 1.0.5
cachetools 6.1.0
cbor 1.0.0
cbor2 5.6.5
certifi 2025.7.14
cffi 1.17.1
charset-normalizer 3.4.2
click 8.2.1
cloudpickle 3.1.1
compressed-tensors 0.10.2
cupy-cuda12x 13.5.1
datasets 4.0.0
depyf 0.19.0
dill 0.3.8
diskcache 5.6.3
distro 1.9.0
dnspython 2.7.0
einops 0.8.1
email_validator 2.2.0
et_xmlfile 2.0.0
fastapi 0.116.1
fastapi-cli 0.0.8
fastapi-cloud-cli 0.1.4
fastrlock 0.8.3
filelock 3.18.0
fire 0.7.0
FlagEmbedding 1.3.5
frozenlist 1.7.0
fsspec 2025.3.0
genson 1.3.0
gguf 0.17.1
h11 0.16.0
hf-xet 1.1.5
httpcore 1.0.9
httptools 0.6.4
httpx 0.28.1
huggingface-hub 0.34.0
idna 3.10
ijson 3.4.0
inscriptis 2.6.0
interegular 0.3.3
ir_datasets 0.5.11
iso3166 2.1.1
Jinja2 3.1.6
jiter 0.10.0
joblib 1.5.1
jsonpath-ng 1.7.0
jsonschema 4.25.0
jsonschema-specifications 2025.4.1
lark 1.2.2
llguidance 0.7.30
llvmlite 0.44.0
lm-format-enforcer 0.10.11
lmdeploy 0.9.2
lxml 6.0.0
lz4 4.4.4
markdown-it-py 3.0.0
MarkupSafe 3.0.2
mdurl 0.1.2
mistral_common 1.8.2
mmengine-lite 0.10.7
modelscope 1.28.1
mpmath 1.3.0
msgpack 1.1.1
msgspec 0.19.0
multidict 6.6.3
multiprocess 0.70.16
nest-asyncio 1.6.0
networkx 3.5
ninja 1.11.1.4
numba 0.61.2
numpy 1.26.4
nvidia-cublas-cu12 12.6.4.1
nvidia-cuda-cupti-cu12 12.6.80
nvidia-cuda-nvrtc-cu12 12.6.77
nvidia-cuda-runtime-cu12 12.6.77
nvidia-cudnn-cu12 9.5.1.17
nvidia-cufft-cu12 11.3.0.4
nvidia-cufile-cu12 1.11.1.6
nvidia-curand-cu12 10.3.7.77
nvidia-cusolver-cu12 11.7.1.2
nvidia-cusparse-cu12 12.5.4.2
nvidia-cusparselt-cu12 0.6.3
nvidia-ml-py 12.575.51
nvidia-nccl-cu12 2.26.2
nvidia-nvjitlink-cu12 12.6.85
nvidia-nvtx-cu12 12.6.77
nvitop 1.5.2
openai 1.90.0
opencv-python-headless 4.12.0.88
openpyxl 3.1.5
outlines 1.1.1
outlines_core 0.1.26
packaging 25.0
pandas 2.3.1
partial-json-parser 0.2.1.1.post6
peft 0.14.0
pillow 11.3.0
pip 25.1.1
platformdirs 4.3.8
ply 3.11
prometheus_client 0.22.1
prometheus-fastapi-instrumentator 7.1.0
propcache 0.3.2
protobuf 6.31.1
psutil 7.0.0
py-cpuinfo 9.0.0
pyarrow 21.0.0
pybase64 1.4.1
pycountry 24.6.1
pycparser 2.22
pydantic 2.11.7
pydantic_core 2.33.2
pydantic-extra-types 2.10.5
Pygments 2.19.2
pynvml 12.0.0
python-dateutil 2.9.0.post0
python-dotenv 1.1.1
python-json-logger 3.3.0
python-multipart 0.0.20
pytz 2025.2
PyYAML 6.0.2
pyzmq 27.0.0
qwen-vl-utils 0.0.11
ray 2.48.0
referencing 0.36.2
regex 2024.11.6
requests 2.32.4
rich 14.1.0
rich-toolkit 0.14.8
rignore 0.6.4
rpds-py 0.26.0
safetensors 0.5.3
scikit-learn 1.7.1
scipy 1.16.0
sentence-transformers 5.1.0
sentencepiece 0.2.0
sentry-sdk 2.33.2
setuptools 79.0.1
shellingham 1.5.4
shortuuid 1.0.13
six 1.17.0
sniffio 1.3.1
soundfile 0.13.1
soupsieve 2.7
soxr 0.5.0.post1
starlette 0.47.2
sympy 1.14.0
termcolor 3.1.0
threadpoolctl 3.6.0
tiktoken 0.9.0
timm 1.0.19
tokenizers 0.21.2
torch 2.7.1
torchaudio 2.7.1
torchvision 0.22.1
tqdm 4.67.1
transformers 4.53.3
trec-car-tools 2.6
triton 3.3.1
typer 0.16.0
typing_extensions 4.14.1
typing-inspection 0.4.1
tzdata 2025.2
unlzw3 0.2.3
urllib3 2.5.0
uvicorn 0.35.0
uvloop 0.21.0
vllm 0.10.0
warc3-wet 0.2.5
warc3-wet-clueweb09 0.2.5
watchfiles 1.1.0
websockets 15.0.1
wheel 0.45.1
xformers 0.0.31
xgrammar 0.1.21
xxhash 3.5.0
yapf 0.43.0
yarl 1.20.1
zlib-state 0.1.9
Error traceback
Is the server hanging? For VLM models with tensor parallelism (tp) enabled, we recommend the PyTorch engine, i.e. --backend pytorch. TurboMind carries a risk of hanging in this case. We are still working on a fix.
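For reference, the suggestion above applied to the command from the Reproduction section might look like the sketch below. The only change is the added --backend pytorch flag; the model path, GPU indices, port, and log file name are copied unchanged from the original report. This thread does not confirm whether this particular AWQ checkpoint loads under the PyTorch engine, so treat that as an assumption to verify.

```shell
# Same launch command as in the Reproduction section, with the PyTorch engine
# selected instead of the default TurboMind backend.
# Assumption: the AWQ checkpoint is loadable by the PyTorch engine (not confirmed in this thread).
CUDA_VISIBLE_DEVICES=2,4 lmdeploy serve api_server /data22/ljc/proj/ckpt/InternVL2_5-38B-MPO-AWQ \
  --backend pytorch \
  --server-port 2556 \
  --cache-max-entry-count 0.5 \
  --tp 2 \
  > internvl2.5_38B.log 2>&1 &
```

If the service still stops responding with this backend, the tail of internvl2.5_38B.log at the time of the failure would help fill in the empty "Error traceback" section.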
@lvhan028 What causes TurboMind to hang?