sparseml
[Feature Branch] Quant modifier UX
Quantization Modifier UX Update
Description
This PR refactors the quantization modifiers to improve the user experience and simplify the system architecture. It is based on changes from ~the sa/quant_mod_refactor~ main branch; all subsequent changes will be merged into this branch as smaller, bite-sized PRs. Key updates include:
- [x] Decoupling Wanda and SparseGPT https://github.com/neuralmagic/sparseml/pull/2266
- [ ] #2272
  - Decoupling SparseGPT and GPTQ
  - Removing quantization features from SparseGPT
  - Adding quantization features to GPTQ
- [ ] #2273
  - Updating GPTQ modifier UX to accept config groups
- [ ] #2281 Preserve base sparsity in GPTQ
- [ ] #2282 Preserve sparsity mask in SparseGPT
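
For reviewers unfamiliar with the last two items, the idea of preserving sparsity can be sketched as follows. This is a minimal, library-free illustration, not the code in #2281 or #2282. It assumes "preserving sparsity" means that weights which were exactly zero before a modifier runs are forced back to zero afterwards. The helper names (`sparsity_mask`, `preserve_sparsity`, `sparsity`) are hypothetical and do not exist in sparseml.

```python
from typing import List

Matrix = List[List[float]]


def sparsity_mask(weights: Matrix) -> List[List[bool]]:
    """Return True wherever a weight is exactly zero (already pruned)."""
    return [[w == 0.0 for w in row] for row in weights]


def preserve_sparsity(original: Matrix, updated: Matrix) -> Matrix:
    """Re-apply the zero pattern of `original` to `updated`.

    `updated` stands in for the weights after a quantization or pruning
    step whose error compensation may have pushed pruned weights away
    from zero (an assumption made for this sketch).
    """
    mask = sparsity_mask(original)
    return [
        [0.0 if is_zero else w for w, is_zero in zip(row, mask_row)]
        for row, mask_row in zip(updated, mask)
    ]


def sparsity(weights: Matrix) -> float:
    """Fraction of weights that are exactly zero."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)


# Toy example: a 50% sparse layer whose zeros drift after an update step.
original = [[0.0, 0.5], [-1.2, 0.0]]
updated = [[0.03, 0.48], [-1.25, -0.01]]
restored = preserve_sparsity(original, updated)
```

In this toy example `updated` has lost the original zero pattern, while `restored` keeps the same zero positions as `original` and leaves the remaining updated weights untouched.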
Reference Documentation
For more detailed information about the changes and their impact, please refer to the documentation here.