Chinese-Mistral
batch inference
Hi authors,
I want to evaluate Mistral-7B on my test dataset. Is single-sample inference (with `model.generate(...)`) the only option? Are there any methods to accelerate the process?
Thanks
You can batch the prompts with the tokenizer, call `model.generate` on the whole batch, and then iterate over the results, e.g. `for input_ids, output_ids in zip(batched_inputs.input_ids, batched_outputs): ...`
Alternatively, refer to https://github.com/ggerganov/llama.cpp for a faster inference runtime.
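To expand on the batched approach, here is a minimal sketch using the Hugging Face `transformers` API. It assumes a decoder-only causal LM checkpoint; the `model_name` argument is a placeholder you should replace with the actual Chinese-Mistral checkpoint path. Note that left padding is required so that `generate` continues from the real end of each prompt, and that `generate` returns prompt + continuation, so the prompt tokens are stripped afterwards.

```python
def trim_prompts(batched_input_ids, batched_output_ids):
    """Strip the prompt tokens from each generated sequence.

    `model.generate` returns prompt + continuation for causal LMs,
    so slicing off len(input_ids) leaves only the new tokens.
    """
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(batched_input_ids, batched_output_ids)
    ]


def batch_generate(model_name, prompts, max_new_tokens=64):
    """Run one batched generate() call over a list of prompts (sketch)."""
    # Deferred import so the helper above stays usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Left padding: decoder-only models must be padded on the left so the
    # last token of every row is the last real prompt token.
    tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token  # common for decoder-only LMs
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

    batched_inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
    batched_outputs = model.generate(**batched_inputs, max_new_tokens=max_new_tokens)

    generated = trim_prompts(batched_inputs.input_ids, batched_outputs)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With left padding the prompt slice has the same length for every row, so the same `trim_prompts` helper works on the whole batch. For larger-scale evaluation, a dedicated serving stack such as llama.cpp (linked above) will generally be faster than plain `generate`.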