
Does TGI support image resize for the Qwen2-VL pipeline?

Open · AHEADer opened this issue 10 months ago · 1 comment

System Info

I tried deploying a fine-tuned Qwen2-VL model with both TGI and vLLM, and some of the results differ between the two frameworks. TGI appears to consume more tokens than vLLM for the same request. I checked TGI's code and it seems the image resize logic is missing. In the Qwen2-VL pipeline, images are resized based on two arguments, max_pixels and min_pixels.
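
For reference, the resizing described here follows the smart_resize helper from Qwen's reference preprocessing (qwen_vl_utils, mirrored in transformers' Qwen2VLImageProcessor). A minimal sketch of that logic, with the default values taken as assumptions and the aspect-ratio checks omitted:

```python
import math

def smart_resize(height: int, width: int, factor: int = 28,
                 min_pixels: int = 56 * 56,
                 max_pixels: int = 14 * 14 * 4 * 1280) -> tuple[int, int]:
    """Rescale (height, width) so the area falls within [min_pixels, max_pixels]
    and both sides are multiples of `factor` (patch_size * merge_size)."""
    # Round both sides to the nearest multiple of the patch factor.
    h_bar = round(height / factor) * factor
    w_bar = round(width / factor) * factor
    if h_bar * w_bar > max_pixels:
        # Image too large: shrink while preserving the aspect ratio.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    elif h_bar * w_bar < min_pixels:
        # Image too small: enlarge while preserving the aspect ratio.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor
    return h_bar, w_bar
```

Because the token count scales with the resized area, skipping this step on a very large image would directly inflate the number of image tokens, which matches the behavior reported below.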

Information

  • [x] Docker
  • [ ] The CLI directly

Tasks

  • [ ] An officially supported command
  • [ ] My own modifications

Reproduction

Deploy a Qwen2-VL-7B model on an inference endpoint, then upload a large image; this triggers an error that the input tokens exceed 32768.
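
For context, a typical TGI launch looks like the sketch below; the image tag and flags are illustrative assumptions, not the exact command used in this report:

```shell
docker run --gpus all --shm-size 1g -p 8080:80 \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id Qwen/Qwen2-VL-7B-Instruct \
    --max-input-tokens 32768
```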


Expected behavior

The server should resize the image based on preprocessor_config.json (max_pixels and min_pixels) and ensure the image tokens for a single request do not grow too large.
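
As a point of comparison, when the model is run through transformers directly, the processor applies this resizing automatically, and the pixel budget from preprocessor_config.json can be overridden at load time. A short sketch (the 256*28*28 to 1280*28*28 range is the one suggested in the Qwen2-VL model card, used here as an assumption):

```python
from transformers import AutoProcessor

# Override min_pixels/max_pixels from preprocessor_config.json at load time;
# the processor then resizes any input image into this pixel range.
processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    min_pixels=256 * 28 * 28,
    max_pixels=1280 * 28 * 28,
)
```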

AHEADer · Jan 16 '25

@AHEADer can you provide the docker command you are using with Qwen2-VL-7B?

ashwani-bhat · Jan 25 '25