LLM-VM add parallel sampling using vllm

Open • daspartho opened this issue 2 years ago • 1 comment

Closes #370

Adds support for parallel sampling with the vLLM library. The vLLM path is used when `num_return_sequences` in the generation kwargs is greater than 1 and the model is supported by vLLM (currently all HF models in LLM-VM).

TODO: handle dependencies
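
For illustration, the dispatch condition described above could be sketched roughly as follows. This is a minimal sketch and not the code from this PR: the helper names (`should_use_vllm`, `to_sampling_kwargs`, `sample_parallel`) and the kwarg mapping table are hypothetical, and the `vllm_support` parameter is assumed to be a plain boolean. Only `LLM`, `SamplingParams`, and `LLM.generate` are actual vLLM calls. Importing vLLM lazily is one possible way to keep it an optional dependency.

```python
from typing import Any, Dict, List

# Hypothetical mapping from Hugging Face generation kwargs to the
# corresponding vLLM SamplingParams fields (an assumption, not from the PR).
HF_TO_VLLM = {
    "num_return_sequences": "n",
    "max_new_tokens": "max_tokens",
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
}


def should_use_vllm(generation_kwargs: Dict[str, Any], vllm_support: bool = True) -> bool:
    """Use vLLM only when several sequences are requested and the model supports it."""
    return bool(vllm_support and generation_kwargs.get("num_return_sequences", 1) > 1)


def to_sampling_kwargs(generation_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Translate the known HF kwargs; kwargs without a mapping are dropped."""
    return {HF_TO_VLLM[k]: v for k, v in generation_kwargs.items() if k in HF_TO_VLLM}


def sample_parallel(model_name: str, prompt: str, generation_kwargs: Dict[str, Any]) -> List[str]:
    """Generate n completions for one prompt in a single vLLM call."""
    # Imported lazily so that vllm is only required when this path is taken.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_name)
    params = SamplingParams(**to_sampling_kwargs(generation_kwargs))
    request_outputs = llm.generate([prompt], params)
    # One RequestOutput per prompt; each holds n CompletionOutput objects.
    return [completion.text for completion in request_outputs[0].outputs]
```

A caller would check `should_use_vllm(kwargs)` first and fall back to the regular Hugging Face `generate` path when it returns `False`.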

Nov 29 '23 16:11 daspartho

Made the suggested changes. `vllm_support` is set to true by default and must be explicitly set to false for unsupported models.

Dec 04 '23 08:12 daspartho
