FastChat
Why is n=1 fixed in the vLLM worker's generate_stream method?
Could you explain why n=1 is hard-coded?
I want to call the vLLM worker directly, but I cannot get multiple choices back. I checked the code and found this
in the generate_stream method:
sampling_params = SamplingParams(
    n=1,
    temperature=temperature,
    top_p=top_p,
    use_beam_search=use_beam_search,
    stop=list(stop),
    stop_token_ids=stop_token_ids,
    max_tokens=max_new_tokens,
    top_k=top_k,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    best_of=best_of,
)
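
To illustrate what I am asking for, here is a rough sketch of reading n from the request instead of fixing it. The helper name build_sampling_kwargs, the "n" request key, and the default values are all hypothetical and not part of FastChat; the check that best_of is not smaller than n is my assumption about what vLLM expects.

```python
from typing import Any, Dict


def build_sampling_kwargs(params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for SamplingParams.

    Reads `n` from the request payload instead of hard-coding it to 1.
    The "n" key and the defaults below are assumptions for illustration.
    """
    n = int(params.get("n", 1))
    if n < 1:
        raise ValueError("n must be at least 1")

    best_of = params.get("best_of")
    # Assumption: when best_of is given, it should not be smaller than n.
    if best_of is not None and int(best_of) < n:
        raise ValueError("best_of must be >= n")

    return {
        "n": n,
        "temperature": float(params.get("temperature", 1.0)),
        "top_p": float(params.get("top_p", 1.0)),
        "max_tokens": int(params.get("max_new_tokens", 256)),
        "best_of": best_of,
    }


# Example request asking for three completions.
kwargs = build_sampling_kwargs({"n": 3, "temperature": 0.7})
print(kwargs["n"])
```

The resulting dict could then be passed as SamplingParams(**kwargs). The worker would presumably also need to return each completion in the request output's outputs list separately, which may be why n is fixed today.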