nf4 topic
nf4 repositories
neural-speed
346 Stars, 38 Forks
An innovative library for efficient LLM inference via low-bit quantization