hxyghostor
Results
2
issues of
hxyghostor
I use Qwen2-vl-2b to do classification task, latency 300ms and QPS 5 with A10 is the performance normal? quantize awq "max_pixels": 256*28*28 prompt text 100 token inference tokens 20
Can VLLM(Qwen2-vl-2b-4bit) support logprobs output? I always get None 