[Bug fix][Core] fixup ngram not setup correctly
ngram_prompt_lookup_max/ngram_prompt_lookup_min need to be passed through SpecDecodeWorker.create_worker's draft_worker_kwargs.
If those two are not passed, an exception is raised because the dict cannot pop those two keys.
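A minimal sketch of the failure mode described above, assuming `create_worker` pops the two ngram keys from `draft_worker_kwargs` without a default (the function body here is hypothetical; only the key names and the pop behavior come from this PR):

```python
# Hypothetical sketch: SpecDecodeWorker.create_worker pops the ngram
# settings from draft_worker_kwargs, so omitting them raises KeyError.
def create_worker(draft_worker_kwargs):
    # dict.pop without a default raises KeyError when the key is missing.
    ngram_max = draft_worker_kwargs.pop("ngram_prompt_lookup_max")
    ngram_min = draft_worker_kwargs.pop("ngram_prompt_lookup_min")
    return ngram_max, ngram_min

# The fix is to make sure the caller includes both keys:
kwargs = {"ngram_prompt_lookup_max": 4, "ngram_prompt_lookup_min": 1}
create_worker(kwargs)  # ok once both keys are passed through
```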
cc @comaniac
+1. Let's get a test covering this path.
Why was it not covered by existing tests?
> Why was it not covered by existing tests?
I guess the existing tests instantiated the worker directly, but this is more of an end-to-end path starting from a higher level?
> Why was it not covered by existing tests?
That's because ngram currently still uses a draft model, set equal to the target model, to get some info like the vocab size. In this failure, the ngram test case was actually turned into a multi-step case with the draft model the same as the target model...
I added an assert in conftest to ensure we are actually on the ngram running path when the corresponding param is set.
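A rough sketch of what such a conftest guard could look like; the helper name and its arguments are hypothetical, assuming the selected proposer type is inspectable after worker creation:

```python
from typing import Optional

# Hypothetical guard: when the ngram param is set, assert the worker
# actually took the ngram path instead of silently falling back to a
# multi-step (draft-model) configuration.
def check_ngram_path(proposer_name: str,
                     ngram_prompt_lookup_max: Optional[int]) -> None:
    if ngram_prompt_lookup_max is not None:
        assert proposer_name == "NGramWorker", (
            "ngram_prompt_lookup_max is set but the ngram proposer "
            "was not selected")

check_ngram_path("NGramWorker", 4)         # ngram requested and selected: ok
check_ngram_path("MultiStepWorker", None)  # ngram not requested: ok
```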
Retrying test infra failure
@cadedaniel this should be ready to merge.
Spec decode tests started failing on the main branch after this PR: https://buildkite.com/vllm/ci/builds/6784#018f551e-d727-491c-be34-9d9fa29f4ea4
The fix PR is here: #4672. Meanwhile, @cadedaniel adjusted the test config in #4592 to work around this issue, so we should be good after merging this one.