Results 1 comments of Sam Hjelmfelt

I encountered this with Mistral 7B on an A10 using AsyncLLMEngine, as soon as the number of pending requests rose above 0. Removing n and best_of from the SamplingParams works around it.
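
A minimal sketch of that workaround, in case it helps anyone else. The helper name strip_multi_sequence_params and the example request values are hypothetical, not part of vLLM; the only assumption about vLLM itself is that SamplingParams accepts these fields as keyword arguments, which may differ between versions. The idea is just to drop n and best_of before the parameters reach the engine, so each request falls back to the defaults.

```python
from typing import Any, Dict

# Keys to drop before building SamplingParams (hypothetical helper, not vLLM API).
MULTI_SEQUENCE_KEYS = ("n", "best_of")


def strip_multi_sequence_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params without n and best_of; the input is not modified."""
    return {k: v for k, v in params.items() if k not in MULTI_SEQUENCE_KEYS}


# Example request parameters (illustrative values only).
request_params = {"temperature": 0.7, "max_tokens": 128, "n": 2, "best_of": 4}
safe_params = strip_multi_sequence_params(request_params)
print(safe_params)

# With vLLM installed, the filtered dict could then be passed on, e.g.:
#   from vllm import SamplingParams
#   sampling_params = SamplingParams(**safe_params)
```

This trades away multiple candidates per request, so it only fits cases where one output per prompt is enough.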