post-training-quantization topic

List post-training-quantization repositories

Adventures-in-TensorFlow-Lite

168 Stars · 33 Forks

This repository contains notebooks that demonstrate how to use TensorFlow Lite to quantize deep neural networks.
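
For context, here is a minimal sketch of the kind of TensorFlow Lite post-training quantization such notebooks typically walk through: full-integer quantization of a Keras model with a representative dataset for calibration. The toy model, the random calibration generator, and the output filename below are placeholders, not code from the repository.

```python
import numpy as np
import tensorflow as tf

# Placeholder model; any trained Keras model works here.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

def representative_dataset():
    # Yield a few calibration batches so the converter can estimate activation ranges.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]        # enable post-training quantization
converter.representative_dataset = representative_dataset   # required for full-integer quantization
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```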

micronet

2.2k Stars · 477 Forks

micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), High-Bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Ar...

TinyNeuralNetwork

716 Stars · 117 Forks

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

neural-compressor

2.2k Stars · 254 Forks · 24 Watchers

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
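
As a rough illustration of what low-bit weight-only quantization does (written library-free in plain PyTorch; this is not the neural-compressor API), here is a sketch of symmetric round-to-nearest INT4 quantization with per-output-channel scales:

```python
import torch

def quantize_int4_per_channel(w: torch.Tensor):
    """Symmetric round-to-nearest INT4 quantization, one scale per output channel."""
    qmax = 7  # symmetric signed 4-bit range: [-7, 7]
    scale = (w.abs().amax(dim=1, keepdim=True) / qmax).clamp(min=1e-8)  # per-row scale
    q = torch.clamp(torch.round(w / scale), -qmax, qmax).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(256, 512)                # stand-in for a linear-layer weight matrix
q, scale = quantize_int4_per_channel(w)
w_hat = dequantize(q, scale)
print("mean abs quantization error:", (w - w_hat).abs().mean().item())
```

Formats such as NF4 and FP4 replace the uniform integer grid above with non-uniform code books, and calibration-based methods additionally adjust scales or weights to reduce the error measured on real activations.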

Sparsebit

321 Stars · 39 Forks

A model compression and acceleration toolbox based on PyTorch.

FQ-ViT

283 Stars · 47 Forks

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

static_quantization

37 Stars · 7 Forks

Post-training static quantization using the ResNet18 architecture
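
A minimal sketch of the eager-mode post-training static quantization flow this kind of repository demonstrates, using the quantizable ResNet18 from torchvision and random tensors as a stand-in for a real calibration loader (both assumptions, not the repository's own code):

```python
import torch
from torchvision.models.quantization import resnet18

# The quantizable ResNet18 variant ships with QuantStub/DeQuantStub and fuse_model().
model = resnet18(weights=None)
model.eval()
model.fuse_model()  # fuse Conv+BN+ReLU blocks before inserting observers

# Post-training static quantization config for the x86 (fbgemm) backend.
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")

# Insert observers, then run a few calibration batches to record activation ranges.
prepared = torch.quantization.prepare(model)
with torch.no_grad():
    for _ in range(10):
        prepared(torch.randn(1, 3, 224, 224))  # replace with real calibration data

# Swap observed modules for quantized INT8 kernels.
quantized = torch.quantization.convert(prepared)
print(quantized(torch.randn(1, 3, 224, 224)).shape)  # (1, 1000)
```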

SqueezeLLM

632 Stars · 42 Forks

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

q-diffusion

313 Stars · 21 Forks

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

QLLM

33 Stars · 2 Forks

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"