Question: Is it possible to use Colqwen with vLLM?
vLLM should support Qwen models, but Colqwen isn't a simple Qwen fine-tune, so I am wondering whether anyone has used the two together successfully.
You use Colqwen to retrieve the image and then pass it into vLLM.
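A minimal sketch of that retrieve-then-generate flow, assuming the retriever scores pages with ColPali-style late interaction (MaxSim) over multi-vector embeddings. The embeddings here are toy lists, and `maxsim_score` and `retrieve_top_page` are hypothetical helper names, not part of Colqwen or vLLM:

```python
# Toy late-interaction (MaxSim) retrieval in pure Python.
# Real Colqwen embeddings would come from the model; these are made-up vectors.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vecs, page_vecs):
    """For each query vector, take its best match among the page's patch
    vectors, then sum those maxima over all query vectors."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)

def retrieve_top_page(query_vecs, pages):
    """Return the name of the page with the highest MaxSim score."""
    return max(pages, key=lambda name: maxsim_score(query_vecs, pages[name]))

# Hypothetical data: one query with two token vectors, two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],
    "page_b": [[0.5, 0.5]],
}

best = retrieve_top_page(query, pages)
# The image behind `best` is what you would then send, together with the
# question, to a generative vision-language model served by vLLM.
print(best)
```

In this setup vLLM only handles the generation step; the retrieval step runs outside it.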
It should be possible, but would require other packages to get working smoothly.
With vllm serve, here are the caveats:
- You might need qwen_vl_utils and other packages.
- You need an HF architecture override:
  --hf-overrides '{"architectures": ["YourColPaliArchitecture"]}'
- You'll also need to ensure that the vLLM runner is a pooling runner, and set other parameters:
  --runner pooling \
  --trust-remote-code \
  --max-model-len 8192 # Where this is the total size of the patched embedding, normally 16 * patch-embedding-size
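The flags above can be assembled into a single command line. Here is a rough sketch that only builds and prints the command, without launching vLLM. The model ID is a placeholder, `YourColPaliArchitecture` is the placeholder from the caveats above, and `build_vllm_serve_command` is a hypothetical helper:

```python
import json
import shlex

def build_vllm_serve_command(model, architecture, max_model_len=8192):
    """Build the `vllm serve` argument list from the flags mentioned above."""
    # json.dumps yields the JSON string expected by --hf-overrides.
    overrides = json.dumps({"architectures": [architecture]})
    return [
        "vllm", "serve", model,
        "--runner", "pooling",
        "--trust-remote-code",
        "--max-model-len", str(max_model_len),
        "--hf-overrides", overrides,
    ]

# Placeholder values: substitute your own checkpoint and architecture name.
cmd = build_vllm_serve_command(
    "your-org/your-colqwen-merged-checkpoint",
    "YourColPaliArchitecture",
)

# shlex.join quotes the JSON override so it can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` instead of printing it would avoid shell quoting problems with the JSON override.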
You can run this: https://github.com/athrael-soju/fastapi-nextjs-colpali-template/tree/main/colpali
What's the status of this issue? If it's still not solved, I am happy to support this in vLLM cc @tonywu71 @ManuelFay
Hello! Neither I nor Tony will have time to work on this! If you know how to, you are more than welcome to, and we will make sure to credit you and share the work! The best models to use are probably the merged checkpoints, not the LoRA ones. Best, Manu