Benjamin Fineran
Results: 11 issues of Benjamin Fineran
This PR adds an `HFQuantizer` for the [compressed-tensors](https://github.com/neuralmagic/compressed-tensors) library. Supported quantization features include:

* FP8, INT4, INT8 (for Q/DQ, arbitrary precision is allowed for INT)
* Activation quantization (static)
*...
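
For readers unfamiliar with the terms, here is a minimal sketch of what Q/DQ (quantize, then dequantize) with an arbitrary integer bit width and a static, precomputed scale could look like. It is plain Python written for illustration only and is not the compressed-tensors or `HFQuantizer` implementation; the function names (`int_range`, `static_scale`, `quantize`, `dequantize`) and the choice of signed symmetric per-tensor quantization are assumptions.

```python
from typing import List, Tuple


def int_range(num_bits: int) -> Tuple[int, int]:
    """Return (qmin, qmax) for a signed integer of the given bit width.

    Assumption: two's-complement range, e.g. (-128, 127) for 8 bits.
    """
    qmax = 2 ** (num_bits - 1) - 1
    return -qmax - 1, qmax


def static_scale(calibration: List[float], num_bits: int) -> float:
    """Compute one fixed scale from calibration data (hypothetical helper).

    "Static" here means the scale is computed once ahead of time and
    reused at inference, rather than recomputed per input.
    """
    _, qmax = int_range(num_bits)
    return max(abs(v) for v in calibration) / qmax


def quantize(values: List[float], scale: float, num_bits: int) -> List[int]:
    """Map floats to integers: divide by scale, round, clamp to range.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    qmin, qmax = int_range(num_bits)
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def dequantize(q: List[int], scale: float) -> List[float]:
    """Map integers back to floats by multiplying with the same scale."""
    return [x * scale for x in q]
```

Because `num_bits` is a plain parameter, the same Q/DQ round trip works for INT4, INT8, or any other width; values outside the representable range are clamped, which is where the quantization error for outliers comes from.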