smoothquant
How does it compare to DeepSpeed?
https://www.microsoft.com/en-us/research/blog/deepspeed-accelerating-large-scale-model-inference-and-training-via-system-optimizations-and-compression/#:~:text=Flexible%20quantization%20support
https://github.com/microsoft/DeepSpeed

Benchmarks would be nice :)