fp8 topic
fp8 repositories
TransformerEngine
1.8k Stars, 306 Forks
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization i...
MS-AMP
510 Stars, 42 Forks
Microsoft Automatic Mixed Precision Library
neural-speed
346 Stars, 38 Forks
An innovative library for efficient LLM inference via low-bit quantization
flux-fp8-api
264 Stars, 37 Forks
Flux diffusion model implementation that uses quantized fp8 matmul, while the remaining layers use faster half-precision accumulation, which is ~2x faster on consumer devices.