sparseml
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
We need to update the transformers version to support the QWEN2-MOE model, see: https://github.com/huggingface/transformers/releases/tag/v4.40.0 _(it also fits into our goal of constantly matching the latest release)_ ## Important changes ####...
- Blocked on k8s runners being available. Only AWS runners currently work.
**Describe the bug** When exporting the YOLOv8s (pruned50-quant, model.pt from sparsezoo) model via the ONNX exporter (sparseml.ultralytics.export_onnx), its performance noticeably decreases compared to the ONNX model available in SparseZoo. **Expected...
# Summary - Add a step to publish the nightly wheel using the nm-action: https://github.com/neuralmagic/nm-actions/blob/main/actions/publish-whl/action.yml - Once built, updated to add in a step to build the nightly container using...
This PR enhances the user experience of the `GPTQModifier` by allowing it to directly accept quantization-related arguments, such as `config_groups`. This change simplifies the configuration process, enabling users to specify...
This PR introduces a structural change by separating concerns between quantization and sparsification. A new `GPTQModifier` is extracted from the existing `SparseGPTModifier`. This ensures that each class now has a...
Bumps [jinja2](https://github.com/pallets/jinja) from 3.0.1 to 3.1.4. Release notes Sourced from jinja2's releases. 3.1.4 This is the Jinja 3.1.4 security release, which fixes security issues and bugs but does not otherwise...