vllm [Bug fix][Core] fixup ngram not setup correctly

[Bug fix][Core] fixup ngram not setup correctly

Open leiwen83 opened this issue 9 months ago • 6 comments

ngram_prompt_lookup_max/ngram_prompt_lookup_min need to be past through SpecDecodeWorker.create_worker's draft_worker_kwargs.

If those two doesn't get past, now there will be exception as dict cannot pop those two keys.

May 02 '24 09:05 leiwen83

cc @comaniac

May 02 '24 14:05 rkooo567

+1. Let's get a test covering this path.

May 02 '24 15:05 cadedaniel

Why was it not covered by existing tests?

May 02 '24 15:05 cadedaniel

Why was it not covered by existing tests?

I guess existing tests directly initiated the worker, but this is more like an end-to-end path starting from a higher level?

May 02 '24 15:05 comaniac

Why was it not covered by existing tests?

It is for current ngram still use draft model set as target model to get some info like vocab size. In this failure, ngram testcase is actually turned into multistep case with draft model same as target model...

I add a check assert in conftest to ensure we current in ngram running path, when corresponding param is set.

May 03 '24 03:05 leiwen83

Retrying test infra failure

May 06 '24 23:05 cadedaniel

@cadedaniel this should be able to merge.

May 07 '24 17:05 comaniac

Spec decode tests start failing in main branch after this PR https://buildkite.com/vllm/ci/builds/6784#018f551e-d727-491c-be34-9d9fa29f4ea4

May 08 '24 16:05 simon-mo

The fix PR is here: #4672 Meanwhile, @cadedaniel adjusted the test config to workaround this issue in #4592, so we should be good after merging this one.

May 08 '24 16:05 comaniac

vllm vllm copied to clipboard

[Bug fix][Core] fixup ngram not setup correctly

vllm
vllm copied to clipboard