colpali Question: Is it possible to use Colqwen with vLLM?

vLLM should support Qwen models, but ColQwen isn't a simple Qwen fine-tune, so I am wondering whether anyone has used the two together successfully.

Feb 13 '25 14:02 kubni

You use ColQwen to retrieve the relevant image and then pass that image to the model served by vLLM.
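To make the retrieval half of that pipeline concrete, here is a minimal sketch of the late-interaction (MaxSim) scoring that ColPali-style retrievers such as ColQwen use to rank pages. The embeddings are toy values and the function names are made up for illustration; in practice the multi-vector embeddings would come from the ColQwen model, and the winning page image would then be sent to the generator running under vLLM.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token, take the best-matching
    page patch, then sum those maxima over all query tokens."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def retrieve_best_page(
    query_vecs: Sequence[Vector], pages: Sequence[Sequence[Vector]]
) -> Tuple[int, List[float]]:
    """Return the index of the highest-scoring page and all page scores."""
    scores = [maxsim_score(query_vecs, page) for page in pages]
    best = max(range(len(pages)), key=scores.__getitem__)
    return best, scores


# Toy multi-vector embeddings (hypothetical values, 2 dimensions each).
query = [[1.0, 0.0], [0.0, 1.0]]          # two query-token vectors
pages = [
    [[1.0, 0.0], [0.5, 0.5]],             # page 0: two patch vectors
    [[0.0, 1.0], [1.0, 0.0]],             # page 1: two patch vectors
]

best_index, scores = retrieve_best_page(query, pages)
# best_index identifies the page image to hand to the generator model.
```

The point of the split is that ColQwen only ranks pages; the answer itself is generated by whatever vision-language model vLLM is serving.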

Mar 12 '25 23:03 abdelkareemkobo

It should be possible, but would require other packages to get working smoothly.

With vllm serve, here are the caveats:

You might need qwen_vl_utils and other packages.
You need an HF architecture override: --hf-overrides '{"architectures": ["YourColPaliArchitecture"]}'
You also need to make sure vLLM uses the pooling runner, and set a few other parameters:

--runner pooling \
--trust-remote-code \
--max-model-len 8192 # Where this is the total size of the patched embedding, normally 16 * patch-embedding-size
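As a rough sketch, the flags above can be assembled into one vllm serve invocation like this. The model name is a hypothetical placeholder, "YourColPaliArchitecture" is the placeholder from the override above and has to be replaced with the real architecture name, and the patch embedding size of 512 is only an assumption chosen because 16 * 512 gives the 8192 used above.

```python
import json
import shlex

# Hypothetical placeholder, not a real checkpoint name.
MODEL = "your-org/your-colqwen-checkpoint"
# Placeholder architecture name taken from the post; replace with the real one.
ARCHITECTURE = "YourColPaliArchitecture"

# Assumption: 512 is used only because 16 * 512 matches the 8192 in the post.
PATCH_EMBEDDING_SIZE = 512
max_model_len = 16 * PATCH_EMBEDDING_SIZE


def build_serve_command(model: str, architecture: str, max_len: int) -> list:
    """Assemble the argument list for `vllm serve` from the flags in the post."""
    overrides = json.dumps({"architectures": [architecture]})
    return [
        "vllm", "serve", model,
        "--hf-overrides", overrides,
        "--runner", "pooling",
        "--trust-remote-code",
        "--max-model-len", str(max_len),
    ]


command = build_serve_command(MODEL, ARCHITECTURE, max_model_len)
# shlex.join quotes the JSON override so it survives a POSIX shell.
command_line = shlex.join(command)
```

Printing command_line gives a string that can be pasted into a shell, or command can be passed directly to subprocess.run. Other parameters may still be needed depending on the checkpoint.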

Aug 26 '25 14:08 nickmitchko

You can run this: https://github.com/athrael-soju/fastapi-nextjs-colpali-template/tree/main/colpali

Aug 30 '25 19:08 athrael-soju

What's the status of this issue? If it's still not solved, I am happy to add support for this in vLLM. cc @tonywu71 @ManuelFay

Oct 25 '25 06:10 yichuan-w

Hello! Neither I nor Tony will have time to work on this. If you know how to, you are more than welcome to, and we will make sure to credit you and share the work! The best models to use are probably the merged checkpoints, not the LoRA ones. Best, Manu

Oct 25 '25 10:10 ManuelFay