LLaVA
Error when running inference with batch size > 1
Question
Hi,
We run inference with:

```python
output_ids = model.generate(
    input_ids,
    images=image_tensor.unsqueeze(0).half().cuda(),
    do_sample=True,
    temperature=0.2,
    max_new_tokens=1024,
    stopping_criteria=[stopping_criteria],
)
```

It works when `input_ids` and `images` both have batch size 1, but it always fails when the batch size is larger than 1. We have removed the stopping criteria and the error still appears. Any suggestion on how to fix it?
Do you mean it gives an error when the image batch size is larger than 1?
Also, an error log would be helpful.
Maybe it is a bug in the conversation template; Vicuna also does not support batch > 1 inference.
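For what it's worth, batched generation with HF-style decoder-only models generally requires left-padded prompts and an explicit attention mask, which the single-sample call above does not pass. Below is a minimal sketch, assuming the tokenizer exposes a pad token and that `prompts` and `image_tensors` are already prepared (both names are illustrative, not from the original post):

```python
import torch

# Illustrative inputs: a list of prompt strings and a list of (C, H, W) image tensors.
prompts = ["<prompt 1>", "<prompt 2>"]

# Decoder-only models need left padding so the generated tokens
# line up at the end of each sequence.
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # assumption: eos is a safe pad fallback

inputs = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")
images = torch.stack(image_tensors).half().cuda()  # shape: (batch, C, H, W)

# Assumption: the model's generate() accepts a batched `images` tensor
# alongside the usual HF generation arguments.
output_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    images=images,
    do_sample=True,
    temperature=0.2,
    max_new_tokens=1024,
)
```

Whether this helps depends on whether LLaVA's `generate` actually handles a batched `images` tensor; if it still fails, posting the traceback would narrow it down.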