[V1] Support bad_words in sampler
Follows the implementation in vllm/logits_process.py
There were two choices for where to tokenize the bad words, and the second path was taken (a rough sketch of the masking step follows the list):

- Pass both `bad_words` and the `tokenizer` via requests, but it turned out that `CachedTokenizer` could not be pickled.
- Tokenize the bad words by adding `SamplingParams.update_from_tokenizer` and invoking it from the engine processor.
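To make the sampler-side mechanism concrete, below is a minimal sketch of the bad-words masking idea, not the actual vLLM code: each bad word is pre-tokenized into a token-id sequence, and at every step the logit of a banned sequence's last token is set to `-inf` whenever the tokens generated so far end with that sequence's prefix. The `apply_bad_words` helper and its signature are illustrative only.

```python
from typing import List

import torch


def apply_bad_words(logits: torch.Tensor,
                    bad_words_token_ids: List[List[int]],
                    past_token_ids: List[int]) -> torch.Tensor:
    """Mask next-token logits so no banned token sequence can be completed."""
    for bad_word_ids in bad_words_token_ids:
        prefix, last_id = bad_word_ids[:-1], bad_word_ids[-1]
        # A single-token bad word is always blocked; a multi-token bad word
        # is blocked only when its prefix matches the tail of the output.
        if not prefix or past_token_ids[-len(prefix):] == prefix:
            logits[last_id] = -float("inf")
    return logits


# Toy example: ban the sequence [5, 7] once token 5 has just been generated.
logits = torch.zeros(10)
masked = apply_bad_words(logits, bad_words_token_ids=[[5, 7]], past_token_ids=[1, 5])
print(masked[7])  # -inf, so token 7 cannot be sampled next
```

In practice the string-to-ids conversion likely also has to account for tokenizers that encode a word differently with and without a leading space, which is one more reason to do it once up front (via `SamplingParams.update_from_tokenizer`) rather than inside the sampler.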
E2E test with prompt `'Hello, my name is'`, `temperature=0`, `top_p=1.0`. Generated text:

- No `bad_words`: ' J.C. and I am a student at the University of California, Berkeley'
- `bad_words=["at the"]`: ' J.C. and I am a student at a private school in the city'
- `bad_words=["at the", "school"]`: ' J.C. and I am a student at a private college in the city'
```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
]

# The three configurations tested above; later assignments override earlier
# ones, so keep only the variant you want to run.
sampling_params = SamplingParams(temperature=0, top_p=1.0)
sampling_params = SamplingParams(temperature=0, top_p=1.0, bad_words=["at the"])
sampling_params = SamplingParams(temperature=0, top_p=1.0, bad_words=["at the", "school"])

llm = LLM(model="facebook/opt-125m", enforce_eager=True)
outputs = llm.generate(prompts, sampling_params)

for i, output in enumerate(outputs):
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}")
    print(f"Generated text: {generated_text!r}")
```
Part of #13058