[Feature Branch] Quant modifier UX

Open rahul-tuli opened this issue 1 year ago • 0 comments

Quantization Modifier UX Update

Description

This PR refactors the quantization modifiers to enhance user experience and simplify the system architecture. It is based on changes from the ~sa/quant_mod_refactor~ main branch; all subsequent changes will be merged into this branch as smaller, bite-sized PRs. Key updates include:

- [x] Decoupling Wanda and SparseGPT https://github.com/neuralmagic/sparseml/pull/2266
- [ ] #2272
  - Decoupling SparseGPT and GPTQ
  - Removing quantization features from SparseGPT
  - Adding quantization features to GPTQ
- [ ] #2273
  - Updating GPTQ modifier UX to accept config groups
- [ ] #2281 Preserve base sparsity in GPTQ
  - [ ] #2282 Preserve sparsity mask in SparseGPT
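
As a rough illustration of what "accepting config groups" could look like, the sketch below models the modifier argument as a mapping from group name to a weight quantization scheme plus the layer types it targets. The key names (`config_groups`, `targets`, `weights`, `num_bits`, `symmetric`, `strategy`) and the helper `scheme_for_layer` are assumptions made for illustration only; they are not confirmed by this PR or taken from the SparseML API.

```python
from typing import Optional

# Hypothetical config-group layout: each group pairs a weight scheme with
# the layer types it targets. All key names are illustrative assumptions.
config_groups = {
    "group_0": {
        "targets": ["Linear"],
        "weights": {"num_bits": 4, "symmetric": True, "strategy": "channel"},
    },
    "group_1": {
        "targets": ["Embedding"],
        "weights": {"num_bits": 8, "symmetric": False, "strategy": "tensor"},
    },
}


def scheme_for_layer(layer_type: str, groups: dict) -> Optional[dict]:
    """Return the weight scheme of the first group that targets layer_type."""
    for group in groups.values():
        if layer_type in group["targets"]:
            return group["weights"]
    return None  # layer type is not covered by any group


print(scheme_for_layer("Linear", config_groups))
```

Grouping schemes by target keeps per-layer-type settings in one place, so a single modifier can, under these assumptions, apply different bit widths to different parts of a model.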

Reference Documentation

For more detailed information about the changes and their impact, please refer to the documentation here.

May 02 '24 15:05 rahul-tuli
