[Bugfix] Add truncate_prompt_tokens to work offline, directly from the LLM class.
Fixes #4507.
- Added a _validate_prompt function to make sure the prompt is in the right format and to run some validations.
- truncate_prompt_tokens truncates the prompt from the left, matching the behavior in vllm/entrypoints/openai/serving_engine.py line 182; see the usage sketch below.
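For reference, a minimal sketch of the intended offline usage once this change lands. The model name, prompt, and token count are placeholders; truncate_prompt_tokens is the existing SamplingParams field that this PR wires through the LLM entrypoint.

```python
from vllm import LLM, SamplingParams

# Placeholder model; any supported model works the same way.
llm = LLM(model="facebook/opt-125m")

# Keep only the last 16 prompt tokens (left truncation), mirroring the
# OpenAI server behavior in vllm/entrypoints/openai/serving_engine.py.
sampling_params = SamplingParams(
    max_tokens=32,
    truncate_prompt_tokens=16,
)

outputs = llm.generate(
    ["A very long prompt that should be truncated from the left ..."],
    sampling_params,
)
for output in outputs:
    print(output.outputs[0].text)
```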
@tdoublep can you help review this?
@simon-mo sure, will try to get to that later today