[Feature] Add support for Speculative Decoding
Hi team, I am new to LightLLM and would like to add speculative decoding support: first n-gram, then Medusa/EAGLE. Is this something the team is already working on internally? If not, is there a good starting point in the codebase for adding it?
Thanks
We currently support speculative decoding for DeepSeek V3 MTP. There are no plans to support all speculative decoding techniques, as many of them are geared toward academic research or narrow scenario-specific optimizations. If you need a specific mode, you might consider inheriting from base_backend.py to create your own inference backend.
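For anyone picking this up: the n-gram variant mentioned above (often called prompt-lookup decoding) is the simplest place to start, since the draft model is just a string match over the existing context. Here is a minimal, standalone Python sketch of the draft-and-verify loop — it does not use LightLLM's actual API (function names like `propose_ngram_draft` and `verify_draft` are hypothetical), but it shows the core logic a custom backend would wrap around the model's forward pass:

```python
def propose_ngram_draft(tokens, n=2, max_draft=4):
    """Propose draft tokens by matching the trailing n-gram against
    earlier context. If that n-gram appeared before, speculate that
    the tokens which followed it will repeat (prompt-lookup decoding)."""
    if len(tokens) < n:
        return []
    suffix = tuple(tokens[-n:])
    # Scan earlier positions for the same n-gram, most recent first,
    # excluding the trailing occurrence itself.
    for start in range(len(tokens) - n - 1, -1, -1):
        if tuple(tokens[start:start + n]) == suffix:
            return list(tokens[start + n:start + n + max_draft])
    return []


def verify_draft(draft, target_next_tokens):
    """Greedy verification: accept the longest prefix of the draft that
    matches what the target model would emit at each draft position
    (target_next_tokens[i] is the target's token at position i)."""
    accepted = []
    for drafted, target in zip(draft, target_next_tokens):
        if drafted != target:
            break
        accepted.append(drafted)
    return accepted
```

In a real backend, `target_next_tokens` would come from a single batched forward pass of the target model over the draft positions, which is where the speedup comes from: one forward pass verifies several tokens at once.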