
llava batch infer, only the result corresponding to the longest prompt is correct, while other results are incorrect

Open lss15151161 opened this issue 8 months ago • 3 comments

Version: TensorRT-LLM 0.10.0. The official script (TensorRT-LLM/examples/multimodal/run.py) repeats the same prompt to form a batch. But if I form a batch from different prompts, the results are incorrect. How can I solve this? Since only the result corresponding to the longest prompt is correct, I suspect the cause is padding.

[screenshot: batch output with different prompts]

If I use the same prompt for every batch entry, the results are correct.

[screenshot: batch output with identical prompts]
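The padding suspicion is plausible: if shorter prompts are right-padded to the batch's max length, generation for those sequences can start from pad positions. A common mitigation in batched decoding is to left-pad the shorter prompts and mark the pads in an attention mask, so every sequence's last real token sits at the same position. Below is a minimal, library-free sketch of that idea; the token ids and `pad_id` are made up for illustration, and this is not the actual TensorRT-LLM API.

```python
# Hypothetical sketch: left-pad a batch of token-id prompts so that
# generation starts at the same position for every sequence.
# Token ids and pad_id are invented for illustration.

def left_pad_batch(prompts, pad_id=0):
    """Pad each prompt on the LEFT to the batch max length.

    Returns (padded_ids, attention_mask), where mask entries are
    0 for pad positions and 1 for real tokens.
    """
    max_len = max(len(p) for p in prompts)
    padded, mask = [], []
    for p in prompts:
        n_pad = max_len - len(p)
        padded.append([pad_id] * n_pad + list(p))
        mask.append([0] * n_pad + [1] * len(p))
    return padded, mask

ids, mask = left_pad_batch([[5, 6, 7, 8], [9, 10]], pad_id=0)
print(ids)   # [[5, 6, 7, 8], [0, 0, 9, 10]]
print(mask)  # [[1, 1, 1, 1], [0, 0, 1, 1]]
```

With right padding, the shorter sequence would end in pad tokens and the model would continue from those, which matches the symptom that only the longest prompt decodes correctly.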

lss15151161 · Jul 03 '24 03:07