model-compression topic
MicroNet_OSI-AI
(NeurIPS 2019 MicroNet Challenge, 3rd place) Open-source code for "SIPA: A simple framework for efficient networks"
research-paper-summaries
A directory of summaries of interesting research papers in the field of deep learning
laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
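The core operation behind layer-selective rank reduction is replacing a chosen weight matrix with a low-rank approximation. A minimal sketch of that step using truncated SVD (the function name `rank_reduce` is illustrative, not from the repo; layer selection, which the paper is actually about, is omitted):

```python
import numpy as np

def rank_reduce(weight: np.ndarray, rank: int) -> np.ndarray:
    """Return the best rank-`rank` approximation of `weight` (truncated SVD)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep only the top `rank` singular directions
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

# Example: reduce a 64x64 weight matrix to rank 8
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
w_low = rank_reduce(w, rank=8)
```

By Eckart–Young, this is the closest rank-8 matrix to `w` in Frobenius norm; the interesting empirical claim of the paper is that applying such reductions to well-chosen layers can *improve* reasoning accuracy, not just preserve it.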
Multistage_Pruning
Cheng-Hao Tu, Jia-Hong Lee, Yi-Ming Chan and Chu-Song Chen, "Pruning Depthwise Separable Convolutions for MobileNet Compression," International Joint Conference on Neural Networks, IJCNN 2020, July 20...
QuantEase
QuantEase, a layer-wise quantization framework, frames the problem as discrete-structured non-convex optimization. Our work leverages Coordinate Descent techniques, offering high-quality solutions wit...
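To make "layer-wise quantization via coordinate descent" concrete, here is a minimal sketch of the general idea (not the actual QuantEase algorithm): minimize the layer output error `||Xw - Xq||²` over quantized weights `q` restricted to a grid, updating one coordinate at a time. All names and the 4-bit grid are assumptions for illustration:

```python
import numpy as np

def quantize_column(X, w, grid, n_iters=3):
    """Quantize one weight column by coordinate descent on ||X @ w - X @ q||^2,
    with each entry of q restricted to the values in `grid`."""
    # Initialize with round-to-nearest quantization
    q = grid[np.abs(grid[None, :] - w[:, None]).argmin(axis=1)]
    target = X @ w
    for _ in range(n_iters):
        for j in range(len(w)):
            # Residual with coordinate j's contribution removed
            r = target - X @ q + X[:, j] * q[j]
            # Unconstrained optimum for q_j, then snap to the grid
            opt = (X[:, j] @ r) / (X[:, j] @ X[:, j])
            q[j] = grid[np.abs(grid - opt).argmin()]
    return q

# Example: 4-bit uniform grid (assumed), calibration data X
rng = np.random.default_rng(0)
X = rng.standard_normal((128, 16))
w = rng.standard_normal(16)
grid = np.linspace(-2, 2, 16)
q = quantize_column(X, w, grid)
```

Each coordinate update cannot increase the objective, so the result is never worse than plain round-to-nearest on the calibration data; that monotone-improvement property is what makes coordinate descent attractive for this discrete, non-convex problem.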
task-aware-distillation
Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023)
KVQuant
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
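For context, KV-cache quantization compresses the cached key/value tensors that grow linearly with context length. The sketch below shows only a plain per-channel uniform quantizer as a baseline, not KVQuant's actual method (which goes further, e.g. handling outliers and choosing quantization axes per tensor type); all names are illustrative:

```python
import numpy as np

def quantize_kv(cache: np.ndarray, bits: int = 4):
    """Per-channel asymmetric uniform quantization of a KV-cache slice
    shaped (tokens, channels). Returns integer codes plus scale and offset."""
    lo = cache.min(axis=0, keepdims=True)
    hi = cache.max(axis=0, keepdims=True)
    scale = (hi - lo) / (2**bits - 1)
    scale = np.where(scale == 0, 1.0, scale)  # guard constant channels
    codes = np.round((cache - lo) / scale).astype(np.uint8)
    return codes, scale, lo

def dequantize_kv(codes, scale, lo):
    return codes * scale + lo

# Example: quantize a small cache slice to 4 bits
rng = np.random.default_rng(0)
cache = rng.standard_normal((32, 8))
codes, scale, lo = quantize_kv(cache, bits=4)
deq = dequantize_kv(codes, scale, lo)
```

At 4 bits this cuts cache memory roughly 4x versus fp16; the per-channel rounding error is bounded by half a quantization step.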
CPSCA
Code for paper "Channel Pruning Guided by Spatial and Channel Attention for DNNs in Intelligent Edge Computing"
awesome-compression
A beginner's tutorial on model compression
Pruning-Deep-Neural-Networks-from-a-Sparsity-Perspective
[ICLR 2023] Pruning Deep Neural Networks from a Sparsity Perspective