
> [@ZhouJYu](https://github.com/ZhouJYu) I tested locally and didn't encounter any errors; the results came out normally. Have you tried running the cases mentioned in the README?
>
> ```
> This image shows a two-wheeled motor vehicle parked outdoors on a tiled surface, with a soft suitcase next to the motorcycle. The motorcycle is parked very close to, and touching, a yellow pavement marking. The number "106" is painted in white on the ground, indicating that it marks the parking spot. There are some green flowers and shrubs around, suggesting a place with frequent daily activity, possibly a residential area. Part of a building is visible in the background, possibly a residential tower. ...
> ```

> Local inference fails with `ImportError: cannot import name 'Qwen2_5_VLForConditionalGeneration' from 'transformers' (/root/miniconda3/envs/llama_factory/lib/python3.11/site-packages/transformers/__init__.py)`
>
> transformers version:
>
> ```
> root@vllm-dev:/home/aigc_worker/aigc/vllm# pip show transformers
> Name: transformers
> Version: 4.48.2
> Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow ...
> ```
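For reference, this `ImportError` usually means the installed transformers predates Qwen2.5-VL support. A minimal version check, assuming support landed in transformers 4.49.0:

```python
# Sketch of a version guard, assuming Qwen2_5_VLForConditionalGeneration
# was added in transformers 4.49.0; upgrade with e.g.
#   pip install -U "transformers>=4.49.0"
import transformers
from packaging import version

print(transformers.__version__)
if version.parse(transformers.__version__) < version.parse("4.49.0"):
    raise RuntimeError("transformers too old for Qwen2_5_VLForConditionalGeneration")

from transformers import Qwen2_5_VLForConditionalGeneration  # should import now
```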

> > It is recommended to use the video path directly. You can use `file://YOUR/VIDEO/PATH` to set the `video_url`, such as `file://datasets/videos/video_1.mp4`. You also need to set `--allowed-local-media-path /` ...
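For completeness, a minimal client-side sketch under those assumptions; the server address, model name, and video path are illustrative, and the server is assumed to have been started with `--allowed-local-media-path /` so `file://` URLs are readable:

```python
# Sketch: send a local video to a vLLM OpenAI-compatible server via file:// URL.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="Qwen/Qwen3-VL-8B-Instruct",  # assumed model name
    messages=[{
        "role": "user",
        "content": [
            # vLLM accepts a video_url content part; the path below is illustrative
            {"type": "video_url", "video_url": {"url": "file://datasets/videos/video_1.mp4"}},
            {"type": "text", "text": "Describe this video."},
        ],
    }],
)
print(response.choices[0].message.content)
```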

Hello, the Instruct model does not include a "think" process, and `model.generate` produces non-streaming output. Therefore, the time you wait for `generate` is actually the total time required to generate...
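If you want to see tokens as they are produced rather than waiting for `generate` to return, a minimal streaming sketch with `TextIteratorStreamer`; the checkpoint name is illustrative, and a text-only prompt is used to keep the example short:

```python
# Sketch: stream model.generate output token-by-token instead of blocking.
from threading import Thread

from transformers import AutoModelForImageTextToText, AutoProcessor, TextIteratorStreamer

model_id = "Qwen/Qwen3-VL-8B-Instruct"  # assumed checkpoint name
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

inputs = processor(text="Describe a motorcycle.", return_tensors="pt").to(model.device)
streamer = TextIteratorStreamer(processor.tokenizer, skip_prompt=True, skip_special_tokens=True)

# generate() blocks until all tokens are produced, so run it in a thread
# and consume the streamer incrementally on the main thread.
thread = Thread(target=model.generate, kwargs=dict(**inputs, streamer=streamer, max_new_tokens=128))
thread.start()
for text_chunk in streamer:
    print(text_chunk, end="", flush=True)
thread.join()
```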

> oops, sorry about that! 😢 you can just replace `Qwen3VLForConditionalGeneration` with `Qwen3VLMoeForConditionalGeneration` for MoE models! I will update it ASAP!

@JJJYmmm Perhaps we could use `AutoModelForImageTextToText` to cover both...
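A sketch of that suggestion, with illustrative checkpoint names: the Auto class reads the checkpoint config and dispatches to the right concrete class, so one code path covers both variants.

```python
# Sketch: AutoModelForImageTextToText dispatches per-checkpoint, so the same
# loading code works for dense and MoE variants. Model names are illustrative.
from transformers import AutoModelForImageTextToText

dense = AutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen3-VL-8B-Instruct",         # resolves to Qwen3VLForConditionalGeneration
    torch_dtype="auto", device_map="auto",
)
moe = AutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen3-VL-235B-A22B-Instruct",  # resolves to Qwen3VLMoeForConditionalGeneration
    torch_dtype="auto", device_map="auto",
)
```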

@JepsonWong Because transformers and vLLM use different operator implementations, there is some numerical fluctuation, so the final results may differ slightly. If convenient, please send me your image so I can check whether the diff is within expectations.

@zekunhao1995 FYI: https://github.com/QwenLM/Qwen3-VL/issues/1643#issuecomment-3424492396

What is your hardware configuration? The launch command looks fine; you could try fp8 or smaller weights.
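For the fp8 route, a sketch based on the README launch command; `Qwen/Qwen3-VL-235B-A22B-Instruct-FP8` is an assumption about the exact checkpoint id:

```
python -m sglang.launch_server \
    --model-path Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 \
    --host 0.0.0.0 \
    --port 22002 \
    --tp 8
```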

> [@wulipc](https://github.com/wulipc) It's 8×H800; in theory that should be enough, right?

I suggest first trying the command from the README, and then troubleshooting from there:

```
python -m sglang.launch_server \
    --model-path Qwen/Qwen3-VL-235B-A22B-Instruct \
    --host 0.0.0.0 \
    --port 22002 \
    --tp 8
```

Could you share the exact error message? Alternatively, you can try llama.cpp: https://github.com/ggml-org/llama.cpp/pull/16780