
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Results 165 sparseml issues

### Motivation If we run SparseGPT on a base model at some sparsity, the sparsity mask after SparseGPT can be very different from the initial one. In other words, SparseGPT...
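One way to quantify how much a pruning pass changed the initial mask is to compare the two masks directly. A minimal sketch (hypothetical tensors and a magnitude-based mask, not SparseML's actual API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense weight matrix and a 50%-sparsity magnitude mask.
weights = rng.normal(size=(8, 8))
initial_mask = np.abs(weights) >= np.median(np.abs(weights))

# Pretend a SparseGPT-style pass updated the weights and produced a new
# mask at the same sparsity level.
updated = weights + rng.normal(scale=0.5, size=weights.shape)
new_mask = np.abs(updated) >= np.median(np.abs(updated))

# Fraction of kept positions shared by both masks (IoU of the masks):
# 1.0 means the mask is unchanged, values near 0 mean heavy drift.
intersection = np.logical_and(initial_mask, new_mask).sum()
union = np.logical_or(initial_mask, new_mask).sum()
print(f"mask IoU: {intersection / union:.2f}")
```

A low IoU between the initial and post-SparseGPT masks is exactly the kind of drift the issue describes.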

Conversion script:

```python
from sparseml.transformers.utils.vllm_export_helpers import export_vllm_checkpoint
from sparseml.transformers import SparseAutoModelForCausalLM, SparseAutoTokenizer

path = "/home/rahul/projects/sparseml/local/local_output/sparsegpt-autogptq-emulation-checkpoint/stage_compression"
sparse_gpt_model = SparseAutoModelForCausalLM.from_pretrained(path)
tokenizer = SparseAutoTokenizer.from_pretrained(path)
export_vllm_checkpoint(
    model=sparse_gpt_model,
    tokenizer=tokenizer,
)
```

```bash
024-03-21 01:58:33 sparseml.pytorch.model_load.helpers...
```

Ultrachat200k has two training splits, one for SFT and another for DPO. As a result, it doesn't have a "train" split per se. This PR allows for a train_sft...
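When a dataset lacks a plain "train" split, one common fix is to map the generic name onto a concrete split before loading. A minimal sketch (the mapping dict, function name, and default choice are hypothetical, not the PR's actual code):

```python
# Hypothetical aliases from a generic "train" request to the dataset's
# actual training splits (SFT and DPO, per the issue description).
SPLIT_ALIASES = {
    "train": "train_sft",  # assumed default: supervised fine-tuning split
    "train_sft": "train_sft",
    "train_dpo": "train_dpo",
}


def resolve_split(name: str) -> str:
    """Return the concrete split name, raising on unknown split requests."""
    try:
        return SPLIT_ALIASES[name]
    except KeyError:
        raise ValueError(
            f"unknown split {name!r}; choose from {sorted(SPLIT_ALIASES)}"
        )


print(resolve_split("train"))  # -> train_sft
```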

There's no need for a period between the question and the line break, since the question will contain its own punctuation (normally a question mark). The period also doesn't match the...

Previously, to create, retrieve, or reset a session, we needed `import sparseml.core.session as session_manager`. This file is now a top-level import, so instead the functions can be...

# Quantization Modifier UX Update ## Description This PR refactors the quantization modifiers to improve the user experience and simplify the system architecture. It is based on changes from ~the...

Requires this compressed-tensors branch: https://github.com/neuralmagic/compressed-tensors/pull/45 * Adds support for saving compressed quantized models within SparseAutoModel saving. Compression type can be passed in via `quantization_format` or inferred from the model itself...

Note: this branch requires this PR: https://github.com/neuralmagic/compressed-tensors/pull/46 to land in `compressed-tensors`. ## Example Use:

```python
from sparseml.transformers import SparseAutoModelForCausalLM, SparseAutoTokenizer, oneshot
import os
import torch

model_name = "Isotonic/TinyMixtral-4x248M-MoE"
model =...
```

vLLM now requires torch 2.3.0, so we should try to raise the version restriction in SparseML. Going forward, I think we shouldn't be so restrictive toward newer versions of PyTorch and...

Scales and zero points were not accounting for the correct groups when iterating over the input channel dimension.
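The bug class described above can be illustrated with a minimal group-quantization sketch (NumPy, hypothetical shapes and a symmetric scheme, not SparseML's implementation): each group of `group_size` input channels must index its own scale rather than reusing one scale across the whole input dimension.

```python
import numpy as np


def quantize_grouped(w: np.ndarray, group_size: int = 4):
    """Symmetric 8-bit quantization with one scale per group of input channels.

    w has shape (out_channels, in_channels); in_channels must be divisible
    by group_size. Scales have shape (out_channels, in_channels // group_size).
    """
    out_ch, in_ch = w.shape
    groups = w.reshape(out_ch, in_ch // group_size, group_size)
    # One scale per (out_channel, group): the key point is indexing the
    # group axis so each block of input channels gets its own scale.
    scales = np.abs(groups).max(axis=-1, keepdims=True) / 127.0
    q = np.clip(np.round(groups / scales), -128, 127)
    return q.reshape(out_ch, in_ch), scales.squeeze(-1)


w = np.arange(16, dtype=np.float64).reshape(2, 8) - 4.0
q, scales = quantize_grouped(w, group_size=4)
print(scales.shape)  # one scale per (out_channel, group)
```

Iterating over input channels with a single scale (or with the wrong group index) silently applies another group's scale, which is the mismatch the fix addresses.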