hxyghostor issues

Results: 2 issues by hxyghostor

VLM inference performance

I am using Qwen2-VL-2B for a classification task. On an A10 I get a latency of 300 ms and a QPS of 5. Is this performance normal? Setup: AWQ quantization, "max_pixels": 256*28*28, a text prompt of 100 tokens, and 20 generated tokens.
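
As a rough sanity check on these figures, here is a minimal sketch of the arithmetic. It assumes the Qwen2-VL convention that each 28x28 pixel block becomes one visual token, and that the latency and QPS were measured under the same load; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the numbers quoted in the issue.
# Assumption: one visual token per 28x28 pixel block (Qwen2-VL convention).
PATCH = 28

max_pixels = 256 * PATCH * PATCH                   # value of "max_pixels"
max_visual_tokens = max_pixels // (PATCH * PATCH)  # upper bound on image tokens

text_prompt_tokens = 100   # from the issue
generated_tokens = 20      # from the issue
latency_s = 0.300          # 300 ms per request, from the issue
qps = 5.0                  # from the issue

# Upper bound on prompt length: image tokens plus text tokens.
prompt_tokens_upper = max_visual_tokens + text_prompt_tokens

# Throughput if requests were handled strictly one at a time.
serial_qps = 1.0 / latency_s

# Little's law: average requests in flight = throughput * latency.
avg_in_flight = qps * latency_s

print(f"max_pixels          = {max_pixels}")
print(f"prompt tokens (max) = {prompt_tokens_upper}")
print(f"serial QPS          = {serial_qps:.2f}")
print(f"avg in flight       = {avg_in_flight:.2f}")
```

If these assumptions hold, one request at a time would give a little over 3 QPS, so a QPS of 5 at 300 ms latency implies roughly 1.5 requests in flight on average, that is, some concurrency or batching during the measurement.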

Is logprobs output supported for VLMs?

Can VLLM (Qwen2-vl-2b-4bit) output logprobs? I always get None. ![Image](https://github.com/user-attachments/assets/cec61a06-ed86-4324-8206-64b1077c009b)