
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Results: 165 sparseml issues

This code updates an existing but unused model analyzer (`AnalyzerModule`) object that computes forward FLOPs, parameters, prunable parameters, and zeroed parameters model-wide. Note that this is somewhat redundant with the...

Code that I used to evaluate the Llama-2-7b model on the CNN/DailyMail dataset with the ROUGE score.
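The evaluation code itself is not shown in the listing; as a rough illustration of the metric involved, here is a minimal pure-Python sketch of ROUGE-1 F1 (unigram overlap between a generated summary and a reference). This is a simplification, not the `rouge_score` library or the code from the issue:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
# 5 of 6 unigrams overlap in both directions, so F1 = 5/6
```

Real evaluations would also report ROUGE-2 and ROUGE-L with stemming and bootstrap confidence intervals.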

## Short-term Work

Issues or PRs that the NM team is planning to tackle this quarter:

* SparseGPT one-shot pruning for Transformers
* [x] Support for text generation LLMs...

The PR adds support for utilizing `HistogramObserver` from PyTorch, which computes the min/max values for quantization by minimizing quantization error. The implementation has been tested on CodeLlama and Llama-2 models.
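Whatever the observer, the end product is a scale and zero-point for the quantized range. As a hedged illustration (not the PR's code, and not PyTorch's histogram-search logic, which additionally refines min/max to minimize quantization error), here is how affine uint8 quantization parameters are derived from an observed min/max:

```python
def calc_qparams(x_min: float, x_max: float, qmin: int = 0, qmax: int = 255):
    """Compute an affine (scale, zero_point) pair for uint8 quantization
    from an observed float range, as a min/max-based observer would."""
    # The range must contain zero so that 0.0 quantizes exactly.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        return 1.0, qmin  # degenerate all-zero tensor
    zero_point = round(qmin - x_min / scale)
    zero_point = max(qmin, min(qmax, zero_point))  # clamp into range
    return scale, zero_point

scale, zp = calc_qparams(-1.0, 3.0)
# scale = 4/255; zero_point = round(1.0 / scale) = 64
```

A histogram-based observer improves on this by searching for a clipped (min, max) whose quantization error over the observed histogram is minimal, which matters for long-tailed LLM activation distributions.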

Add support for loading Transformers models without specifying task attributes. This is especially useful for exporting models for embedding extraction. This is currently accessed via the `"model"` or `"base"` task - I'm...

Bumps [scikit-learn](https://github.com/scikit-learn/scikit-learn) from 0.24.2 to 1.0.1.

**Release notes** (sourced from scikit-learn's releases): scikit-learn 1.0.1 - "We're happy to announce the 1.0.1 release with several bugfixes." You can see the changelog here:...

dependencies

**Describe the bug**
`RecursionError: maximum recursion depth exceeded while getting the str of an object`

**Expected behavior**
I want to convert a LLaMA model into ONNX and then benchmark...

bug
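The issue does not show a fix, and the root cause may lie elsewhere (e.g. a `__repr__` that recurses into itself). A common first-line workaround for deep model graphs overflowing Python's default recursion limit is to raise it temporarily; this helper is a hypothetical sketch, not code from the issue:

```python
import sys

def with_recursion_limit(limit, fn, *args, **kwargs):
    """Run fn under a temporarily raised recursion limit, restoring
    the previous limit afterwards even if fn raises."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return fn(*args, **kwargs)
    finally:
        sys.setrecursionlimit(old)

def depth(n):
    # Deeply recursive stand-in for a graph traversal during export.
    return 0 if n == 0 else 1 + depth(n - 1)

# depth(2000) would exceed CPython's default limit of 1000;
# under a limit of 5000 it completes.
result = with_recursion_limit(5000, depth, 2000)
```

If raising the limit merely delays the error, the recursion is likely unbounded and the fix belongs in the traversal itself.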

Additional tests to ensure Top-KAST is working as intended. Bugfix: when computing weight decay for the backwards-only weights (set B in the paper), the multiplier should be proportional to 1/(the...

Not ready for prime time, but it does make LLM export much more memory-efficient and faster.