sparseml
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
1. Add `zoo_model.unzip()`, which unzips everything. (Users can still call unzip per sub-directory if they want.) 2. Add an `unzip=False` parameter to the download params so you can call `zoo_model.download(unzip=True)`. _Originally posted...
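The download/unzip pattern described above can be sketched as follows. This is a minimal, self-contained illustration, not the real sparsezoo implementation: `ZooModelSketch` and its internals are hypothetical stand-ins showing how a `download(unzip=...)` flag can defer to a separate `unzip()` method.

```python
import zipfile
from pathlib import Path


class ZooModelSketch:
    """Hypothetical stand-in for a zoo model object (illustration only).

    Mirrors the pattern above: ``download(unzip=False)`` fetches the archive,
    and a separate ``unzip()`` extracts everything afterwards.
    """

    def __init__(self, archive_bytes: bytes, dest: Path):
        self._archive_bytes = archive_bytes  # stands in for a remote file
        self.dest = dest
        self.archive_path = dest / "model.zip"

    def download(self, unzip: bool = False) -> Path:
        # "Fetch" the archive into the destination directory.
        self.dest.mkdir(parents=True, exist_ok=True)
        self.archive_path.write_bytes(self._archive_bytes)
        if unzip:
            self.unzip()
        return self.archive_path

    def unzip(self) -> None:
        # Extract everything; a real API could also extract per sub-directory.
        with zipfile.ZipFile(self.archive_path) as zf:
            zf.extractall(self.dest)
```

With this shape, `model.download()` keeps the old behavior (archive only), while `model.download(unzip=True)` is a one-call convenience for users who want the extracted files.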
## Summary This pull request removes the dependency on `nm-transformers` in favor of the original HF `transformers`. The primary motivation for this change is to simplify maintenance...
## Summary Bug fix for running a custom dataset using accelerate with a preprocessing function from the module `src/sparseml/transformers/utils/preprocesing_funtions.py`, specified by a string argument in the parser.

```bash
#!/bin/bash
export CUDA_VISIBLE_DEVICES="0,1,2,3"
NPROC=$(($(echo $CUDA_VISIBLE_DEVICES |...
```
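Resolving a preprocessing function from a string parser argument, as described above, can be sketched like this. The registry, function names, and lookup helper here are all hypothetical; this only illustrates the string-to-callable dispatch pattern, not sparseml's actual module.

```python
# Hypothetical registry mapping CLI string names to preprocessing callables.
def preprocess_lowercase(example: dict) -> dict:
    """Example preprocessing function: lowercase the text field."""
    example["text"] = example["text"].lower()
    return example


PREPROCESSING_FUNCTIONS = {
    "preprocess_lowercase": preprocess_lowercase,
}


def resolve_preprocessing_fn(name: str):
    """Look up a preprocessing function by the string passed on the CLI."""
    try:
        return PREPROCESSING_FUNCTIONS[name]
    except KeyError:
        known = ", ".join(sorted(PREPROCESSING_FUNCTIONS))
        raise ValueError(
            f"unknown preprocessing function {name!r}; known: {known}"
        ) from None
```

Failing fast with the list of known names makes a typo in the parser argument an immediate, readable error rather than a confusing failure deep inside the training loop.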
# Summary

- Add a workflow which, based on the conditions, will create a nightly, dev, or release container.
- So far, the workflows rely on a CUDA-based Python 3.8...
**Describe the bug** I train a ViT that has an intermediate output, which is fed back into the network to modulate the activations in a feedback loop. Unfortunately,...
**Describe the bug** Error converting Mistral to ONNX.

**Expected behavior**

```
!pip install virtualenv
!virtualenv myenv
!source /content/myenv/bin/activate
!git clone https://github.com/neuralmagic/sparseml
#!pip install sparseml
!pip install -e "sparseml[transformers]"
#!pip uninstall...
```
I tried the same method described in https://github.com/neuralmagic/sparseml/blob/main/integrations/torchvision/tutorials/docs-torchvision-sparsify-from-scratch-resnet50-beans.ipynb with the same recipe files, and here are my results:

*dense*
Val Loss: 0.03551835000926499
Top 1 Acc: 0.9824561403508771
Model size: 89.895 MB...
## Feature Description

The results of my experimentation with the `tiny_starcoder` model.

## Findings

- The original KV cache is being added not as separate arrays `past_key_values.{attn_block_id}.values` and `past_key_values.{attn_block_id}.keys`, but...
Current RigL code uses the param scorer to enforce custom sparsity patterns (Erdos-Renyi, ERK) on the masks. This PR instead passes the ER/ERK sparsity targets directly to the mask creator,...
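For context, the Erdos-Renyi-Kernel (ERK) heuristic assigns each layer a density proportional to `sum(shape) / prod(shape)`, scaled globally so the parameter-weighted average density matches the target. A simplified sketch (my own illustration, not this PR's mask creator; it omits the iterative capping of layers whose density would exceed 1):

```python
import math


def erk_sparsities(shapes, global_sparsity):
    """Per-layer sparsities under the Erdos-Renyi-Kernel (ERK) heuristic.

    Each layer's density is proportional to sum(shape) / prod(shape), scaled
    by a factor eps chosen so the parameter-weighted average density equals
    1 - global_sparsity. Simplified: no capping of densities above 1.
    """
    sizes = [math.prod(s) for s in shapes]
    scores = [sum(s) / math.prod(s) for s in shapes]
    total = sum(sizes)
    eps = (1.0 - global_sparsity) * total / sum(
        n * r for n, r in zip(sizes, scores)
    )
    return [1.0 - eps * r for r in scores]
```

The effect is that large layers (small `sum/prod` ratio) get pruned more aggressively, while small layers keep relatively more weights, which is exactly the kind of target a mask creator can consume directly instead of going through a param scorer.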