@hsiangchun4 @mkamranr `No available shared memory broadcast block found in 60 seconds.` This is just normal output: vLLM may be performing time-consuming operations such as CUDA graph compilation. You can...
@whwangovo According to the [community guide](https://docs.vllm.ai/projects/recipes/en/latest/Qwen/Qwen3-VL.html#qwen3-vl-235b-a22b-instruct), you can try the following configuration:

```
vllm serve Qwen/Qwen3-VL-235B-A22B-Instruct \
  --tensor-parallel-size 8 \
  --max-model-len 128000 \
  --async-scheduling
```
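For reference, here is a minimal sketch of querying the resulting endpoint with the OpenAI Python client. The base URL assumes vLLM's default port 8000 and the default permissive `api_key` (both are assumptions, not part of the guide above):

```python
# Minimal sketch: query the `vllm serve` endpoint started above.
# base_url/api_key are assumptions (vLLM defaults to port 8000 and
# accepts any key unless --api-key is set).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-VL-235B-A22B-Instruct",  # must match the served model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```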
Yes, you can pass the local video file by `file:// + your local absolute path`, for example:

```python
video_url_for_local = "file:///your/local/path/to/v_3l7quTy4c2s.mp4"  # file:// + your local absolute path
video_url_for_remote = ...
```
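For a fuller picture, a minimal sketch of how such a `file://` URL might be consumed, assuming `qwen_vl_utils` is the downstream consumer (the path, prompt, and message layout here are placeholders, not from the original comment):

```python
# Minimal sketch (assumption: qwen_vl_utils is installed and the path exists):
# a file:// URL works anywhere the examples accept a remote video URL.
from qwen_vl_utils import process_vision_info

video_url_for_local = "file:///your/local/path/to/v_3l7quTy4c2s.mp4"

messages = [{
    "role": "user",
    "content": [
        {"type": "video", "video": video_url_for_local},
        {"type": "text", "text": "Describe this video."},
    ],
}]

# process_vision_info resolves the file:// URL and loads the frames locally.
image_inputs, video_inputs = process_vision_info(messages)
```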
> Hi, could you share your Python, PyTorch, and CUDA versions? I'm also deploying vLLM but ran into a version mismatch.
>
> > python -m xformers.info
> > WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
> > PyTorch 2.5.1 with CUDA 1201 (you have 2.5.1+cu121)...
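A minimal sketch for printing the two versions that warning compares, so the installed PyTorch/CUDA build can be checked against what xFormers was built for:

```python
# Minimal sketch: print the installed PyTorch version and its CUDA build,
# which must match the build xFormers expects.
import torch

print("torch:", torch.__version__)        # e.g. 2.5.1+cu121
print("CUDA build:", torch.version.cuda)  # e.g. 12.1
```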
> [@wulipc](https://github.com/wulipc) [@XyWzzZ](https://github.com/XyWzzZ) Hello, how did you solve this problem? Could you upload an example of an OpenAI-style JSON request body? In the docs I only see hand-written code, rather than requests made directly through Postman or curl.

Hi, please see the example in the README. The `hand-written code` you mention is exactly the process of generating the OpenAI-style JSON request body. Best regards.
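To make that concrete, a minimal sketch of what such a generated request body looks like; the endpoint, model name, and image URL are placeholders, and printing the payload gives the raw JSON you could paste into Postman or send with curl:

```python
# Minimal sketch: build and print an OpenAI-style JSON request body, then
# POST it to a vLLM server. All names and URLs here are placeholders.
import json
import requests

payload = {
    "model": "Qwen2.5-VL-7B-Instruct",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/demo.jpeg"}},
            {"type": "text", "text": "What is in this image?"},
        ],
    }],
}

print(json.dumps(payload, indent=2))  # the raw body usable in Postman/curl
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(resp.json())
```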
> Hello, this outputs True, but the xformers issue above persists.

@hweidream It looks like your GPU does not support flash_attn; the `outputs True` result is not a reliable signal.
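One way to check the card directly, as a minimal sketch, assuming FlashAttention-2 requires an Ampere GPU or newer (compute capability >= 8.0):

```python
# Minimal sketch: check whether the GPU meets FlashAttention-2's assumed
# minimum compute capability (8.0, i.e. Ampere or newer).
import torch

major, minor = torch.cuda.get_device_capability()
print(f"Compute capability: {major}.{minor}")
print("FlashAttention-2 supported:", (major, minor) >= (8, 0))
```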
@618-github Refer to this example from the docs; the video needs to be base64-encoded first and cannot be passed as a raw path:

```python
import base64
import numpy as np
from PIL import Image
from io import BytesIO
from openai import OpenAI
from qwen_vl_utils import process_vision_info

# Set OpenAI's...
```
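Since that example is cut off, here is a minimal sketch of the base64 flow it starts: read the video bytes, encode them, and embed them as a data URL in an OpenAI-style request. The file path, model name, and server URL are placeholders:

```python
# Minimal sketch: base64-encode a local video and send it as a data URL.
# Paths, model name, and base_url are assumptions, not from the original docs.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("/your/local/path/to/video.mp4", "rb") as f:
    video_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="Qwen2.5-VL-7B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "video_url",
             "video_url": {"url": f"data:video/mp4;base64,{video_b64}"}},
            {"type": "text", "text": "Describe this video."},
        ],
    }],
)
print(response.choices[0].message.content)
```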
Thank you for your feedback. The Qwen2.5-VL series models are well supported by vLLM. However, our Docker image is designed for various scenarios (including deployment of vLLM services), so it may...