nf4 topic

List nf4 repositories

neural-speed

Stars: 346
Forks: 38

An innovative library for efficient LLM inference via low-bit quantization