[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.
BaohaoLiao
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
AutonomicPerfectionist