inference-efficiency topic

List inference-efficiency repositories

Moonlit

72 Stars, 7 Forks

This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.

GlimpsePrune

85 Stars, 1 Fork, 85 Watchers

Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"