Running inference on Ovis2 with vLLM
How can we run inference on Ovis2 using vLLM?
List of multimodal models supported in vLLM: https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-multimodal-language-models
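For context, this is the minimal offline-inference pattern I would expect to work with a multimodal model in vLLM >= 0.9.0. The image path and the `<image>` prompt template are placeholders on my part; check the Ovis2 model card for the exact chat format:

```python
from PIL import Image
from vllm import LLM, SamplingParams

# Assumes vLLM >= 0.9.0, where Ovis2 support was merged.
# trust_remote_code may not be required once support is fully native.
llm = LLM(model="AIDC-AI/Ovis2-8B", trust_remote_code=True)

image = Image.open("example.jpg").convert("RGB")  # placeholder image path
prompt = "<image>\nDescribe this image."  # assumed prompt template

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.0, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```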
They state Ovis2 support has been merged in v0.9.0 (https://github.com/vllm-project/vllm/pull/15826), but I cannot run the quantized version AIDC-AI/Ovis2-8B-GPTQ-Int4:
ValueError: There is no module or parameter named 'visual_tokenizer.backbone.trunk.blocks.0.mlp.fc1.weight' in Ovis
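In case it helps with reproduction, this is roughly how I load the quantized checkpoint. The explicit quantization argument is only there for clarity; vLLM also auto-detects GPTQ from the checkpoint's quantization_config:

```python
from vllm import LLM

# Fails during weight loading with the ValueError above.
llm = LLM(
    model="AIDC-AI/Ovis2-8B-GPTQ-Int4",
    quantization="gptq",  # auto-detected if omitted
    trust_remote_code=True,
)
```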
@vaclcer I second the issue with GPTQ quantization. I get the same error regardless of whether I use the gptq or gptq_marlin quantization config.
Same issue when using Ovis2-2B-GPTQ-Int4.