LLM-VM

add parallel sampling using vllm

Open • daspartho opened this issue 2 years ago • 1 comment

close #370

Adds support for parallel sampling with the vllm library. It is used when num_return_sequences in the generation kwargs is greater than 1 and the model is supported by vllm (currently all HF models in llm-vm).

TODO: handle dependencies

daspartho • Nov 29 '23 16:11

Made the suggested changes. vllm_support is now true by default and must be set to false explicitly for unsupported models.
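The dispatch rule described in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names `should_use_vllm` and `sample_with_vllm` are hypothetical, and the mapping from `max_new_tokens` and `temperature` to vllm's sampling parameters is an assumption. Only `num_return_sequences` and `vllm_support` come from the discussion above.

```python
from typing import Any, Dict, List


def should_use_vllm(generation_kwargs: Dict[str, Any], vllm_support: bool = True) -> bool:
    """Return True when parallel sampling should be routed to vllm.

    Mirrors the rule in this thread: more than one sequence is requested
    and the model has not been explicitly marked as unsupported.
    """
    num_sequences = generation_kwargs.get("num_return_sequences", 1)
    return vllm_support and num_sequences > 1


def sample_with_vllm(model_name: str, prompt: str, generation_kwargs: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: draw several completions for one prompt via vllm."""
    # Imported lazily so the dispatch check above works without vllm installed.
    from vllm import LLM, SamplingParams

    params = SamplingParams(
        n=generation_kwargs["num_return_sequences"],  # samples per prompt
        temperature=generation_kwargs.get("temperature", 1.0),
        # Assumption: the HF-style max_new_tokens kwarg maps to max_tokens.
        max_tokens=generation_kwargs.get("max_new_tokens", 16),
    )
    llm = LLM(model=model_name)
    request_output = llm.generate([prompt], params)[0]
    return [completion.text for completion in request_output.outputs]
```

A caller would check `should_use_vllm(kwargs, vllm_support)` first and fall back to the regular Hugging Face generation path when it returns False.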

daspartho • Dec 04 '23 08:12