speculative-decoding topics

intel-extension-for-transformers

2.1k

Stars

209

Forks

Watchers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

intel

4-bits

attention-sink

chatbot

chatpdf

BigLittleDecoder

85

Stars

10

Forks

Watchers

[NeurIPS'23] Speculative Decoding with Big Little Decoder

kssteven418

decoding

efficient-inference

fast-inference

llm

SpecDec

31

Stars

0

Forks

Watchers

Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

hemingkx

non-autoregressive

speculative-decoding

aphrodite-engine

1.0k

Stars

112

Forks

Watchers

Large-scale LLM inference engine

PygmalionAI

api-rest

inference-engine

machine-learning

avx512

Sequoia

305

Stars

31

Forks

Watchers

Scalable and robust tree-based speculative decoding algorithm

Infini-AI-Lab

efficiency

inference

llm

speculative-decoding

TriForce

209

Stars

12

Forks

Watchers

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Infini-AI-Lab

acceleration

efficiency

inference

llm

EAGLE

780

Stars

79

Forks

Watchers

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

SafeAILab

large-language-models

llm-inference

speculative-decoding

speculative_decoding.c

16

Stars

2

Forks

Watchers

Minimal C implementation of speculative decoding based on llama2.c

mscheong01

artificial-intelligence

c

llama2

llm

REST

163

Stars

10

Forks

Watchers

REST: Retrieval-Based Speculative Decoding, NAACL 2024

FasterDecoding

llm-inference

retrieval

speculative-decoding