[V1] Support bad_words in sampler
Follows the implementation in vllm/logits_process.py
There were two choices for where to tokenize the bad words, and the second path was taken (a rough sketch of the masking step follows the list):

- Pass both `bad_words` and the `tokenizer` via requests, but it turned out that `CachedTokenizer` could not be pickled.
- Tokenize the bad words by adding `SamplingParams.update_from_tokenizer` and invoking it from the engine processor.
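To make the sampler-side mechanism concrete, below is a minimal sketch of the bad-words masking idea, not the actual vLLM code: each bad word is pre-tokenized into a token-id sequence, and at every step the logit of a banned sequence's last token is set to `-inf` whenever the tokens generated so far end with that sequence's prefix. The `apply_bad_words` helper and its signature are illustrative only.

```python
from typing import List

import torch


def apply_bad_words(logits: torch.Tensor,
                    bad_words_token_ids: List[List[int]],
                    past_token_ids: List[int]) -> torch.Tensor:
    """Mask next-token logits so no banned token sequence can be completed."""
    for bad_word_ids in bad_words_token_ids:
        prefix, last_id = bad_word_ids[:-1], bad_word_ids[-1]
        # A single-token bad word is always blocked; a multi-token bad word
        # is blocked only when its prefix matches the tail of the output.
        if not prefix or past_token_ids[-len(prefix):] == prefix:
            logits[last_id] = -float("inf")
    return logits


# Toy example: ban the sequence [5, 7] once token 5 has just been generated.
logits = torch.zeros(10)
masked = apply_bad_words(logits, bad_words_token_ids=[[5, 7]], past_token_ids=[1, 5])
print(masked[7])  # -inf, so token 7 cannot be sampled next
```

In practice the string-to-ids conversion likely also has to account for tokenizers that encode a word differently with and without a leading space, which is one more reason to do it once up front (via `SamplingParams.update_from_tokenizer`) rather than inside the sampler.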
E2E test with prompt `'Hello, my name is'`, `temperature=0`, `top_p=1.0`. Generated text:

- No `bad_words`: ' J.C. and I am a student at the University of California, Berkeley'
- `bad_words=["at the"]`: ' J.C. and I am a student at a private school in the city'
- `bad_words=["at the", "school"]`: ' J.C. and I am a student at a private college in the city'
```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
]

# The three configurations tested above; later assignments override earlier
# ones, so keep only the variant you want to run.
sampling_params = SamplingParams(temperature=0, top_p=1.0)
sampling_params = SamplingParams(temperature=0, top_p=1.0, bad_words=["at the"])
sampling_params = SamplingParams(temperature=0, top_p=1.0, bad_words=["at the", "school"])

llm = LLM(model="facebook/opt-125m", enforce_eager=True)
outputs = llm.generate(prompts, sampling_params)

for i, output in enumerate(outputs):
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}")
    print(f"Generated text: {generated_text!r}")
```
Part of #13058