[Usage]: How do I pass fps and max_pixels after starting a qwen2vl-7b service with vLLM?
You should sample the video frames outside of vLLM.
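To illustrate what sampling outside of vLLM could look like, here is a minimal sketch that only picks which frame indices to keep for a target fps. The function name `sample_frame_indices` and its parameters are made up for this example; decoding the frames themselves (with OpenCV, decord, or similar) is left to whatever video library you already use.

```python
def sample_frame_indices(total_frames: int, native_fps: float, target_fps: float) -> list[int]:
    """Return the indices of the frames to keep when downsampling to target_fps.

    Hypothetical helper for illustration only; it is not part of vLLM.
    """
    if total_frames <= 0 or native_fps <= 0 or target_fps <= 0:
        raise ValueError("total_frames, native_fps and target_fps must be positive")

    # Never sample faster than the video's native rate.
    step = max(native_fps / target_fps, 1.0)

    indices = []
    count = 0
    while True:
        index = int(count * step)
        if index >= total_frames:
            break
        indices.append(index)
        count += 1
    return indices


if __name__ == "__main__":
    # A 10-frame clip recorded at 30 fps, sampled at 10 fps.
    print(sample_frame_indices(10, 30.0, 10.0))
```

You would then decode only those frames and pass them to vLLM as the video input.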
You can set `max_pixels` per request via the `mm_processor_kwargs` key (which is passed alongside `multi_modal_data`) in offline inference. This isn't supported in online inference though, so if you're using `vllm serve` you have to pass it at startup time.
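For the offline case, a rough sketch of such a request is below. The helper names `build_request` and `run_offline`, the `max_pixels` value, and the sampling settings are assumptions for illustration; the prompt string also has to follow the model's chat template with the right image placeholder, which is not shown here.

```python
from typing import Any


def build_request(prompt: str, image: Any, max_pixels: int) -> dict[str, Any]:
    """Build one offline-inference prompt with per-request processor kwargs.

    Hypothetical helper; the dict keys are the ones described in the reply above.
    """
    return {
        "prompt": prompt,
        "multi_modal_data": {"image": image},
        "mm_processor_kwargs": {"max_pixels": max_pixels},
    }


def run_offline(model_path: str, request: dict[str, Any]) -> str:
    """Run a single request through vLLM's offline API (needs vllm and a GPU)."""
    # Imported lazily so build_request can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_path)
    outputs = llm.generate(request, SamplingParams(max_tokens=64))
    return outputs[0].outputs[0].text


if __name__ == "__main__":
    # The image would normally be a PIL.Image; None is only a stand-in here.
    request = build_request("<your templated prompt>", None, 1280 * 28 * 28)
    print(request["mm_processor_kwargs"])
```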
I passed the max_pixels parameter through mm_processor_kwargs, but encountered an error:
`api_server.py error argument --mm_processor_kwargs invalid loads value max_pixels:798`
Command:
`python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 8088 --model /app/qwen2vl-7b --tensor-parallel-size 1 --gpu-memory-utilization 0.95 --served-model-name qwen2vl-7b --mm_processor_kwargs {"max_pixels":798} --trust-remote-code`
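You need to pass it as a JSON string. Without outer quotes, the shell strips the double quotes before vLLM sees the argument, so what arrives is no longer valid JSON. You can enclose it in single quotes, i.e. `'{"max_pixels":798}'`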
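If it helps to see why the quoting matters, here is a small sketch that uses Python's `shlex` to mimic POSIX shell word splitting and then tries to parse the result as JSON. The helper name `parse_kwargs_argument` is made up for this example, and real shells may differ in details such as brace expansion.

```python
import json
import shlex


def parse_kwargs_argument(command_line: str) -> dict:
    """Split a command line the way a POSIX shell would, then JSON-parse the
    value that follows --mm_processor_kwargs. Hypothetical helper for illustration."""
    tokens = shlex.split(command_line)
    value = tokens[tokens.index("--mm_processor_kwargs") + 1]
    return json.loads(value)


if __name__ == "__main__":
    unquoted = '--mm_processor_kwargs {"max_pixels":798}'
    quoted = "--mm_processor_kwargs '{\"max_pixels\":798}'"

    try:
        parse_kwargs_argument(unquoted)
    except json.JSONDecodeError as exc:
        # The inner double quotes were consumed by the shell-style splitting.
        print("unquoted form fails:", exc)

    print("quoted form parses to:", parse_kwargs_argument(quoted))
```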
@DarkLight1337 Hi, for Qwen2.5-VL online inference, I'd like to pass the fps parameter via mm_processor_kwargs, since it is needed to compute the second_per_grid_ts value accurately. However, extra_body does not currently support mm_processor_kwargs. Are there plans to support passing fps through mm_processor_kwargs via extra_body or some other mechanism?
https://docs.vllm.ai/en/v0.6.3/serving/openai_compatible_server.html
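For context on why fps matters here: as far as I understand the Hugging Face Qwen2.5-VL processor, the per-grid duration (named `second_per_grid_ts` there) is the temporal patch size divided by the sampling fps, with a default temporal patch size of 2. The sketch below only reproduces that arithmetic under this assumption; the function name and the default are illustrative and should be checked against the processor version you actually run.

```python
def seconds_per_temporal_grid(fps: float, temporal_patch_size: int = 2) -> float:
    """Seconds of video covered by one temporal grid step.

    Assumption: mirrors temporal_patch_size / fps as done in the Hugging Face
    Qwen2.5-VL processor; verify against your installed transformers version.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return temporal_patch_size / fps


if __name__ == "__main__":
    # Frames sampled at 2 fps: each temporal grid step spans 1 second.
    print(seconds_per_temporal_grid(2.0))
    # Frames sampled at 0.5 fps: each temporal grid step spans 4 seconds.
    print(seconds_per_temporal_grid(0.5))
```

If the server assumes a different fps than the one used when the frames were sampled, this value no longer matches the real timing of the video.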
We don't yet have plans to add this. Feel free to open a PR and contribute to this!
@DarkLight1337 OK, let me confirm, the optimal solution is to pass mm_processor_kwargs through extra_body, right?
Yes, I think that would work.