PAN Jiacheng
Have you managed to solve this problem? I'm running into something similar.
https://github.com/vllm-project/vllm/issues/15185 is a similar issue (but on Qwen2.5-VL)
Btw, I also tested this by switching to V0. V0 works fine, so the issue is with V1.
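For anyone who wants to try the same thing, this is roughly how I did the switch (a minimal sketch; it assumes a vLLM version where engine selection is controlled by the `VLLM_USE_V1` environment variable, and the model name is just illustrative):

```python
import os

# Force the legacy V0 engine. This must be set before vllm is imported,
# since vLLM reads VLLM_USE_V1 when the engine is constructed.
os.environ["VLLM_USE_V1"] = "0"

from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2-VL-7B-Instruct")  # illustrative model
outputs = llm.generate(
    ["Describe this scene."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```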
```
    completions: List[RequestOutput] = self.inference_engine.generate(
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.local/lib/python3.11/site-packages/vllm/utils.py", line 1072, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.local/lib/python3.11/site-packages/vllm/entrypoints/llm.py", line 465, in generate
    outputs = self._run_engine(use_tqdm=use_tqdm)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.local/lib/python3.11/site-packages/vllm/entrypoints/llm.py", line 1375,...
```
Update: after switching to V0, it can run longer without such errors. But after some time, I still got the error:
```
  File "/home/tiger/.local/lib/python3.11/site-packages/vllm/model_executor/models/qwen2_vl.py", line 1379, in forward
    inputs_embeds...
```
> I have seen this occur when sending random inputs to the model; one might accidentally include the token in the random distribution, leading to errors. If not this, maybe...
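In case it helps anyone stress-testing with random token IDs, here is a sketch of how to keep special tokens out of the random distribution (the tokenizer name is illustrative, and this is an assumption about the test setup, not the actual repro script):

```python
import random
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")

# Drop every special token id (vision pads, eos, etc.) from the pool,
# so a random prompt can never contain a multimodal placeholder.
special_ids = set(tokenizer.all_special_ids)
candidate_ids = [i for i in range(tokenizer.vocab_size) if i not in special_ids]

# 128 random prompt token ids with no special tokens mixed in.
random_prompt_ids = random.choices(candidate_ids, k=128)
```

The resulting ids can then be passed to vLLM as a token prompt (e.g. `llm.generate([{"prompt_token_ids": random_prompt_ids}], ...)`), which avoids accidentally tokenizing a placeholder string.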
@DarkLight1337 @Isotr0py Hi guys, I understand that this issue might be specific to Qwen and might be hard to fix. Rather than locating the issue in the code and fixing...
Update: I figured this might have something to do with special tokens being generated. I'm working on a fix, but setting a small list of "bad_words" can cause CUDA...
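Concretely, the workaround I'm testing looks like this (a sketch; it assumes a vLLM version whose `SamplingParams` accepts `bad_words`, and the token strings are Qwen2-VL's multimodal placeholders):

```python
from vllm import SamplingParams

# Ban Qwen2-VL's multimodal placeholder tokens from ever being sampled.
# Caution: in my runs this workaround itself eventually led to the CUDA
# issue mentioned above, so treat it as experimental.
sampling_params = SamplingParams(
    max_tokens=512,
    bad_words=[
        "<|vision_start|>",
        "<|vision_end|>",
        "<|image_pad|>",
        "<|video_pad|>",
    ],
)
```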