[v1] Support allowed_token_ids in v1 Sampler

Open houseroad opened this issue 1 week ago • 3 comments

Follow the implementation in vllm/entrypoints/openai/logits_processors.py.

The idea is straightforward: add a [batch_size x vocab_size] mask tensor, and use a list of bools to decide whether to do the in-place masked fill.
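A minimal sketch of that idea, not the code in this PR: it uses plain Python lists in place of tensors so it runs without dependencies, and the names `build_mask` and `apply_allowed_token_ids` are hypothetical. With PyTorch tensors, the inner loops would presumably collapse to a single `logits.masked_fill_(mask, float("-inf"))` call.

```python
import math
from typing import List, Optional, Tuple


def build_mask(
    allowed_token_ids: List[Optional[List[int]]], vocab_size: int
) -> Tuple[List[List[bool]], List[bool]]:
    """Build a [batch_size x vocab_size] mask plus one bool per request.

    mask[i][t] is True when token t must be blocked for request i.
    has_allowed[i] is True when request i set allowed_token_ids.
    """
    mask: List[List[bool]] = []
    has_allowed: List[bool] = []
    for allowed in allowed_token_ids:
        if allowed is None:
            # No restriction for this request: nothing is blocked.
            mask.append([False] * vocab_size)
            has_allowed.append(False)
        else:
            allowed_set = set(allowed)
            mask.append([t not in allowed_set for t in range(vocab_size)])
            has_allowed.append(True)
    return mask, has_allowed


def apply_allowed_token_ids(
    logits: List[List[float]],
    mask: List[List[bool]],
    has_allowed: List[bool],
) -> List[List[float]]:
    """Set blocked logits to -inf in place; skip the work if no request needs it."""
    if not any(has_allowed):
        return logits
    for row, row_mask in zip(logits, mask):
        for t, blocked in enumerate(row_mask):
            if blocked:
                row[t] = -math.inf
    return logits


if __name__ == "__main__":
    logits = [[0.1, 0.2, 0.3, 0.4], [1.0, 2.0, 3.0, 4.0]]
    # Request 0 may only sample tokens 1 and 3; request 1 is unrestricted.
    mask, has_allowed = build_mask([[1, 3], None], vocab_size=4)
    print(apply_allowed_token_ids(logits, mask, has_allowed))
```

The list of bools lets the sampler skip the fill entirely when no request in the batch restricts its tokens, which is the common case.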

  • [x] add test
  • [x] move some verification from _get_allowed_token_ids_logits_processor to SamplingParams.

Test with

  • pytest tests/v1/sample/test_sampler.py
  • pytest tests/v1/worker/test_gpu_input_batch.py

houseroad, Feb 13 '25 07:02