
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Results 165 sparseml issues

### Motivation If we run SparseGPT on a base model at some sparsity, the sparsity mask after SparseGPT can be very different from the initial one. In other words, SparseGPT...
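One way to quantify how much a pruning pass changed the initial mask is to compare the two masks directly. A minimal sketch (hypothetical tensors and a magnitude-based mask, not SparseML's actual API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense weight matrix and a 50%-sparsity magnitude mask.
weights = rng.normal(size=(8, 8))
initial_mask = np.abs(weights) >= np.median(np.abs(weights))

# Pretend a SparseGPT-style pass updated the weights and produced a new
# mask at the same sparsity level.
updated = weights + rng.normal(scale=0.5, size=weights.shape)
new_mask = np.abs(updated) >= np.median(np.abs(updated))

# Fraction of kept positions shared by both masks (IoU of the masks):
# 1.0 means the mask is unchanged, values near 0 mean heavy drift.
intersection = np.logical_and(initial_mask, new_mask).sum()
union = np.logical_or(initial_mask, new_mask).sum()
print(f"mask IoU: {intersection / union:.2f}")
```

A low IoU between the initial and post-SparseGPT masks is exactly the kind of drift the issue describes.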

Conversion script:

```python
from sparseml.transformers.utils.vllm_export_helpers import export_vllm_checkpoint
from sparseml.transformers import SparseAutoModelForCausalLM, SparseAutoTokenizer

path = "/home/rahul/projects/sparseml/local/local_output/sparsegpt-autogptq-emulation-checkpoint/stage_compression"
sparse_gpt_model = SparseAutoModelForCausalLM.from_pretrained(path)
tokenizer = SparseAutoTokenizer.from_pretrained(path)
export_vllm_checkpoint(
    model=sparse_gpt_model,
    tokenizer=tokenizer,
)
```

```bash
024-03-21 01:58:33 sparseml.pytorch.model_load.helpers...
```

Ultrachat200k has two training splits, one for SFT and another for DPO. As a result, it doesn't have a "train" split per se. This PR allows for a train_sft...
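When a dataset lacks a plain "train" split, one common fix is to map the generic name onto a concrete split before loading. A minimal sketch (the mapping dict, function name, and default choice are hypothetical, not the PR's actual code):

```python
# Hypothetical aliases from a generic "train" request to the dataset's
# actual training splits (SFT and DPO, per the issue description).
SPLIT_ALIASES = {
    "train": "train_sft",  # assumed default: supervised fine-tuning split
    "train_sft": "train_sft",
    "train_dpo": "train_dpo",
}


def resolve_split(name: str) -> str:
    """Return the concrete split name, raising on unknown split requests."""
    try:
        return SPLIT_ALIASES[name]
    except KeyError:
        raise ValueError(
            f"unknown split {name!r}; choose from {sorted(SPLIT_ALIASES)}"
        )


print(resolve_split("train"))  # -> train_sft
```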

There's no need for a period between the question and the line break, since the question will contain its own punctuation (normally a question mark). The period also doesn't match the...

Previously, to create, retrieve, or reset a session, we needed `import sparseml.core.session as session_manager`. This file is now a top-level import, so instead the functions can be...

# Quantization Modifier UX Update ## Description This PR refactors the quantization modifiers to improve the user experience and simplify the system architecture. It is based on changes from ~the...

Requires this compressed-tensors branch: https://github.com/neuralmagic/compressed-tensors/pull/45 * Adds support for saving compressed quantized models within SparseAutoModel saving. Compression type can be passed in via `quantization_format` or inferred from the model itself...

Note: this branch requires this PR: https://github.com/neuralmagic/compressed-tensors/pull/46 to land in `compressed-tensors`. ## Example Use:

```python
from sparseml.transformers import SparseAutoModelForCausalLM, SparseAutoTokenizer, oneshot
import os
import torch

model_name = "Isotonic/TinyMixtral-4x248M-MoE"
model =...
```

vLLM now requires torch 2.3.0, so we should try to raise the version restriction in SparseML. Going forward, I think we shouldn't be so restrictive toward newer versions of PyTorch and...

Scales and zero points were not accounting for the correct groups when iterating over the input channel dimension.
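The bug class described above can be illustrated with a minimal group-quantization sketch (NumPy, hypothetical shapes and a symmetric scheme, not SparseML's implementation): each group of `group_size` input channels must index its own scale rather than reusing one scale across the whole input dimension.

```python
import numpy as np


def quantize_grouped(w: np.ndarray, group_size: int = 4):
    """Symmetric 8-bit quantization with one scale per group of input channels.

    w has shape (out_channels, in_channels); in_channels must be divisible
    by group_size. Scales have shape (out_channels, in_channels // group_size).
    """
    out_ch, in_ch = w.shape
    groups = w.reshape(out_ch, in_ch // group_size, group_size)
    # One scale per (out_channel, group): the key point is indexing the
    # group axis so each block of input channels gets its own scale.
    scales = np.abs(groups).max(axis=-1, keepdims=True) / 127.0
    q = np.clip(np.round(groups / scales), -128, 127)
    return q.reshape(out_ch, in_ch), scales.squeeze(-1)


w = np.arange(16, dtype=np.float64).reshape(2, 8) - 4.0
q, scales = quantize_grouped(w, group_size=4)
print(scales.shape)  # one scale per (out_channel, group)
```

Iterating over input channels with a single scale (or with the wrong group index) silently applies another group's scale, which is the mismatch the fix addresses.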