PainlessInferenceAcceleration
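In the paper I saw Table 8, "Inference Latency with Lookahead for vLLM".
Is the code for the vLLM + lookahead part available? Or does this repository's code need to be modified? If so, how should it be changed to work with vLLM?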