int8 topic

List int8 repositories

neural-compressor

2.2k
Stars
254
Forks
24
Watchers

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

retinaface

297
Stars
90
Forks
Watchers

RetinaFace reimplemented in C++ with TensorRT

yolov5_tensorrt_int8_tools

173
Stars
40
Forks
Watchers

TensorRT INT8 quantization of YOLOv5 ONNX models

yolov5_tensorrt_int8

165
Stars
26
Forks
Watchers

TensorRT INT8 quantized deployment of the yolov5s model, measured at 3.3 ms per frame!

RepVGG_TensorRT_int8

62
Stars
15
Forks
Watchers

RepVGG TensorRT INT8 quantization, with measured inference under 1 ms per frame!

ncnn-yolov4-int8

21
Stars
5
Forks
Watchers

NCNN + INT8 + YOLOv4: quantized model and real-time inference

neural-speed

346
Stars
38
Forks
Watchers

An innovative library for efficient LLM inference via low-bit quantization

Tensorrt-int8-quantization-pipline

56
Stars
3
Forks
Watchers

A simple pipeline for INT8 quantization based on TensorRT.

nanodet_tensorrt_int8

37
Stars
7
Forks
Watchers

NanoDet INT8 quantization, with measured inference at 2 ms per frame!

YOLOv8-ONNX-TensorRT

35
Stars
3
Forks
Watchers

👀 Apply YOLOv8 exported to ONNX or TensorRT (FP16, INT8) to a real-time camera feed