vllm icon indicating copy to clipboard operation
vllm copied to clipboard

[Bug fix][Core] fixup ngram not setup correctly

Open leiwen83 opened this issue 9 months ago • 6 comments

ngram_prompt_lookup_max/ngram_prompt_lookup_min need to be past through SpecDecodeWorker.create_worker's draft_worker_kwargs.

If those two doesn't get past, now there will be exception as dict cannot pop those two keys.

leiwen83 avatar May 02 '24 09:05 leiwen83

cc @comaniac

rkooo567 avatar May 02 '24 14:05 rkooo567

+1. Let's get a test covering this path.

cadedaniel avatar May 02 '24 15:05 cadedaniel

Why was it not covered by existing tests?

cadedaniel avatar May 02 '24 15:05 cadedaniel

Why was it not covered by existing tests?

I guess existing tests directly initiated the worker, but this is more like an end-to-end path starting from a higher level?

comaniac avatar May 02 '24 15:05 comaniac

Why was it not covered by existing tests?

It is for current ngram still use draft model set as target model to get some info like vocab size. In this failure, ngram testcase is actually turned into multistep case with draft model same as target model...

I add a check assert in conftest to ensure we current in ngram running path, when corresponding param is set.

leiwen83 avatar May 03 '24 03:05 leiwen83

Retrying test infra failure

cadedaniel avatar May 06 '24 23:05 cadedaniel

@cadedaniel this should be able to merge.

comaniac avatar May 07 '24 17:05 comaniac

Spec decode tests start failing in main branch after this PR https://buildkite.com/vllm/ci/builds/6784#018f551e-d727-491c-be34-9d9fa29f4ea4

simon-mo avatar May 08 '24 16:05 simon-mo

The fix PR is here: #4672 Meanwhile, @cadedaniel adjusted the test config to workaround this issue in #4592, so we should be good after merging this one.

comaniac avatar May 08 '24 16:05 comaniac