huu3301
@ogabrielluiz I'm not sure whether the version is v0.6.15. I used the command "git pull origin dev" yesterday and found this issue.
I know where the bug is: in RetrievalQA.py, if return_source_documents is false, result.get("source_documents") returns None, which causes an error in the self.to_records function.
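A minimal sketch of the failure mode described in this comment and one possible guard. The `to_records` function here is a hypothetical stand-in, not the project's actual implementation, and `safe_source_documents` is a helper name invented for illustration; the only assumption is that the real method iterates over the documents it receives.

```python
from typing import Any, Dict, List, Optional


def to_records(documents: List[Any]) -> List[Dict[str, Any]]:
    # Hypothetical stand-in for self.to_records: it iterates over its input,
    # so passing None would raise "TypeError: 'NoneType' object is not iterable".
    return [{"text": str(doc)} for doc in documents]


def safe_source_documents(result: Dict[str, Any]) -> List[Any]:
    # dict.get returns None when the key is missing (for example when
    # return_source_documents is false), so fall back to an empty list.
    docs: Optional[List[Any]] = result.get("source_documents")
    return docs or []


# Result without source documents, as when return_source_documents is false.
result_without_sources = {"result": "some answer"}
records = to_records(safe_source_documents(result_without_sources))
```

With the guard in place, a result that lacks source documents yields an empty record list instead of an exception.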
@lvhan028 Environment: CentOS, V100 GPU, Driver Version: 550.54.14, lmdeploy==0.6.0, torch==2.3.1, transformers==4.44.2. I deployed InternVL2-8B with the command: NVIDIA_VISIBLE_DEVICES=1 lmdeploy serve api_server /data/models/OpenGVLab/InternVL2-8B --tp 1 --server-port 11251 --model-name InternVL2-8B --cache-max-entry-count 0.25 The following error occurs during inference: Assertion fail: /lmdeploy/src/turbomind/kernels/attention/attention.cu:35 What I have tried: 1. Downgrading lmdeploy to 0.5.3...
In version v1.1.3, I encountered a similar problem. The time shown in the log is 12 hours earlier than the actual time. The value of LOG_TZ was set to Asia/Shanghai....
The same problem occasionally occurs in version 0.9.0: qwen-1 | 2025-07-02 19:34:16,159 - lmdeploy - INFO - turbomind.py:711 - [async_stream_infer] CancelledError qwen-1 | 2025-07-02 19:34:16,159 - lmdeploy - ERROR - async_engine.py:599 - [safe_run] exception caught: CancelledError Cancelled...
@forrestsocool @dosu The same problem occurs when I use a workflow as a tool. That is because the output type of the tool is string, so the next block can not...