model-compression topic

Repositories tagged with the model-compression topic

MicroNet_OSI-AI

18 Stars, 6 Forks

(NeurIPS-2019 MicroNet Challenge - 3rd Winner) Open source code for "SIPA: A simple framework for efficient networks"

research-paper-summaries

17 Stars, 0 Forks

A directory with some interesting research paper summaries in the field of Deep Learning

laser

361 Stars, 26 Forks

The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

Multistage_Pruning

16 Stars, 3 Forks

Cheng-Hao Tu, Jia-Hong Lee, Yi-Ming Chan and Chu-Song Chen, "Pruning Depthwise Separable Convolutions for MobileNet Compression," International Joint Conference on Neural Networks, IJCNN 2020, July 20...

QuantEase

17 Stars, 1 Fork

QuantEase, a layer-wise quantization framework, frames the problem as discrete-structured non-convex optimization. Our work leverages Coordinate Descent techniques, offering high-quality solutions wit...

task-aware-distillation

20 Stars, 3 Forks

Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML 2023)

KVQuant

286 Stars, 25 Forks

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

CPSCA

16 Stars, 4 Forks

Code for paper "Channel Pruning Guided by Spatial and Channel Attention for DNNs in Intelligent Edge Computing"