MiniGPT-4
Error with batch inference
Assertion error (out-of-bounds index) when doing inference with a batch size greater than 1:
```
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [15,0,0], thread: [64,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [15,0,0], thread: [65,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block: [15,0,0], thread: [66,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
...
```
The model runs normally when the batch size is 1.
The error occurs in this piece of code:

```python
outputs = model.llama_model.generate(
    inputs_embeds=emb,
    max_new_tokens=max_new_tokens,
    stopping_criteria=stopping_criteria,
    num_beams=num_beams,
    do_sample=True,
    min_length=min_length,
    top_p=top_p,
    repetition_penalty=repetition_penalty,
    length_penalty=length_penalty,
    temperature=temperature,
)
```
How can I run inference in batches?
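For what it's worth, a frequent culprit for this kind of out-of-bounds assertion in batched generation is stacking variable-length prompt embeddings without padding them to a common length and without passing an `attention_mask`. Below is a minimal sketch of that padding step; it is not code from the MiniGPT-4 repo, and `emb_list` and `pad_embed` are hypothetical names standing in for the per-sample embeddings and the embedding of a pad/bos token:

```python
import torch

# A minimal sketch, assuming each sample's prompt embeddings come as a
# [seq_len, hidden] tensor. Left-pad every sample to the batch's max length
# and build the matching 0/1 attention mask, so that generate() can be
# called with a batch of different-length prompts.
def pad_embeds_left(emb_list, pad_embed):
    # emb_list: list of [seq_len, hidden] tensors (one per sample)
    # pad_embed: [1, hidden] tensor, e.g. the embedding of the pad/bos token
    max_len = max(e.shape[0] for e in emb_list)
    embs, masks = [], []
    for e in emb_list:
        pad_len = max_len - e.shape[0]
        embs.append(torch.cat([pad_embed.expand(pad_len, -1), e], dim=0))
        masks.append(torch.cat([torch.zeros(pad_len, dtype=torch.long),
                                torch.ones(e.shape[0], dtype=torch.long)]))
    return torch.stack(embs), torch.stack(masks)

# Tiny self-contained demo with random tensors standing in for real prompts:
emb_list = [torch.randn(5, 8), torch.randn(3, 8)]
pad_embed = torch.zeros(1, 8)
emb, attention_mask = pad_embeds_left(emb_list, pad_embed)
print(emb.shape, attention_mask.shape)  # [2, 5, 8] and [2, 5]

# Both tensors would then go into the call above, e.g.
# outputs = model.llama_model.generate(
#     inputs_embeds=emb, attention_mask=attention_mask, ...)
```

Left-padding (rather than right-padding) matters here because a decoder-only model continues generation from the rightmost position of each sequence.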
I have the same problem. Did you write code that supports batch inference?
What is your `max_new_tokens`? I find that for batch inference, that value should not be too large. For me, 64 works in most cases (though the answer might be truncated).
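Applied to the snippet above, that would just mean lowering the cap (64 here, matching the value that worked for this commenter; all other arguments unchanged):

```python
outputs = model.llama_model.generate(
    inputs_embeds=emb,
    max_new_tokens=64,  # keep this small for batched generation; long answers may be cut off
    stopping_criteria=stopping_criteria,
    num_beams=num_beams,
    do_sample=True,
    min_length=min_length,
    top_p=top_p,
    repetition_penalty=repetition_penalty,
    length_penalty=length_penalty,
    temperature=temperature,
)
```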