efficient-inference topic

Repositories tagged with the efficient-inference topic:

graphless-neural-networks (80 stars, 20 forks)
[ICLR 2022] Code for Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (GLNN)

SqueezeLLM (632 stars, 42 forks)
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
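The core idea behind dense-and-sparse quantization is to pull a small fraction of outlier weights out into a sparse full-precision matrix, so the remaining dense part has a narrow range and quantizes well at very low bit-widths. A minimal NumPy sketch of that decomposition (SqueezeLLM itself uses sensitivity-based non-uniform quantization; the uniform rounding and the `dense_and_sparse` helper below are simplifications for illustration):

```python
import numpy as np

def dense_and_sparse(W, outlier_pct=0.5, n_bits=3):
    """Split W into a sparse full-precision outlier matrix plus a
    dense low-bit quantized remainder, so W ~= dense + sparse."""
    # keep the largest-magnitude outlier_pct% of weights exactly
    thresh = np.percentile(np.abs(W), 100 - outlier_pct)
    sparse = np.where(np.abs(W) >= thresh, W, 0.0)
    rest = W - sparse  # outlier-free, hence a much narrower range
    # uniform symmetric quantization of the dense part (simplified)
    scale = np.abs(rest).max() / (2 ** (n_bits - 1) - 1)
    q = np.round(rest / scale).astype(np.int8)
    return q * scale, sparse

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
dense, sparse = dense_and_sparse(W)
max_err = np.abs(W - (dense + sparse)).max()
```

Because the few outliers no longer stretch the quantization range, the reconstruction error of the dense 3-bit part stays small, while the sparse matrix costs little storage.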

DeepCache (767 stars, 36 forks)
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

LLMCompiler (1.4k stars, 104 forks)
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

BigLittleDecoder (85 stars, 10 forks)
[NeurIPS 2023] Speculative Decoding with Big Little Decoder

speculative-decoding (160 stars, 14 forks)
Explorations into some recent techniques surrounding speculative decoding
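Speculative decoding, the technique behind the two entries above, drafts several tokens with a cheap model and then verifies them with the expensive target model in one pass, using a rejection step that keeps the output distribution identical to sampling from the target alone. A toy sketch with hypothetical stand-in models (`draft_model` and `target_model` below are illustrative placeholders, not APIs from either repository):

```python
import numpy as np

VOCAB = 4
rng = np.random.default_rng(0)

def draft_model(seq):
    # cheap stand-in model: distribution over a 4-token vocabulary
    p = np.exp([len(seq) % 3, 1.0, 0.5, 0.2])
    return p / p.sum()

def target_model(seq):
    # expensive stand-in model: the distribution we want to match
    p = np.exp([len(seq) % 3, 1.2, 0.4, 0.1])
    return p / p.sum()

def speculative_step(seq, k=4):
    """Draft k tokens cheaply, then accept/reject them so the result
    is distributed as if sampled from target_model directly."""
    drafted, draft_probs, s = [], [], list(seq)
    for _ in range(k):
        q = draft_model(s)
        t = rng.choice(VOCAB, p=q)
        drafted.append(t); draft_probs.append(q); s.append(t)
    out = list(seq)
    for t, q in zip(drafted, draft_probs):
        p = target_model(out)
        if rng.random() < min(1.0, p[t] / q[t]):
            out.append(t)  # accept the drafted token
        else:
            # reject: resample from the residual max(p - q, 0)
            resid = np.maximum(p - q, 0.0)
            out.append(rng.choice(VOCAB, p=resid / resid.sum()))
            return out     # stop at the first rejection
    # all k drafts accepted: take one bonus token from the target
    out.append(rng.choice(VOCAB, p=target_model(out)))
    return out
```

One call to `speculative_step` extends the sequence by between one token (immediate rejection) and k + 1 tokens (all drafts accepted plus the bonus token), which is where the speedup comes from.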

lzu (46 stars, 5 forks)
[CVPR 2023] Code for Learning to Zoom and Unzoom

TinyML-Benchmark-NNs-on-MCUs (31 stars, 11 forks)
Code for WF-IoT paper 'TinyML Benchmark: Executing Fully Connected Neural Networks on Commodity Microcontrollers'

triple-wins (24 stars, 7 forks)
[ICLR 2020] "Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference"

LightGaussian (551 stars, 49 forks)
[NeurIPS 2024 Spotlight] "LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang