sparseml
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Hi, I tried your example from the main page:

git clone https://github.com/neuralmagic/sparseml
pip install -e "sparseml[transformers]"
wget https://huggingface.co/neuralmagic/TinyLlama-1.1B-Chat-v0.4-pruned50-quant-ds/raw/main/recipe.yaml
sparseml.transformers.text_generation.oneshot --model_name TinyLlama/TinyLlama-1.1B-Chat-v1.0 --dataset_name open_platypus --recipe recipe.yaml --output_dir ./obcq_deployment --precision float16...
* Adds new vLLMQuantizationModifier that supports the new framework in compressed-tensors * Adds support for loading a model quantized in the compressed-tensors framework * Testing scripts for comparing performance to...
Bumps [pydantic](https://github.com/pydantic/pydantic) from 1.7.4 to 1.10.13. Release notes Sourced from pydantic's releases. V1.10.13 2023-09-27 What's Changed Update pip commands to install 1.10 by @chbndrhnns in pydantic/pydantic#6930 Make the v1 mypy...
# Summary - Update obcq tests into separate integration tests - Add/Update `test_sparsities --> test_obcq_sparsity.py`; now tests Llama-7b using gpu and "auto" using the same sparsity recipe as TinyStories -...
Applying character masks to prompts in the format `[foo]some text here\n[bar]response here`, masking the characters owned by `[bar]`
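The PR summary above does not show its implementation; a minimal sketch of the idea, assuming a hypothetical `char_mask` helper (the name, the regex for speaker tags, and the "segment runs until the next tag" rule are all illustrative assumptions, not sparseml's actual code):

```python
import re

def char_mask(prompt: str, masked_speaker: str = "[bar]") -> list[bool]:
    """Return one boolean per character: True where the character belongs
    to a segment owned by `masked_speaker`."""
    mask = [False] * len(prompt)
    # Find each speaker tag like [foo] or [bar]; a tag owns everything
    # from its opening bracket up to the next tag (or end of string).
    tags = list(re.finditer(r"\[[^\]]+\]", prompt))
    for i, tag in enumerate(tags):
        end = tags[i + 1].start() if i + 1 < len(tags) else len(prompt)
        if tag.group(0) == masked_speaker:
            for j in range(tag.start(), end):
                mask[j] = True
    return mask

prompt = "[foo]some text here\n[bar]response here"
mask = char_mask(prompt)
# Keep only the masked span to see what the mask covers.
masked_text = "".join(c for c, m in zip(prompt, mask) if m)
```

With the example prompt above, `masked_text` is `"[bar]response here"`: the `[foo]` segment stays unmasked, while everything from the `[bar]` tag onward is flagged. A character-level mask like this can then be mapped onto token spans when building loss masks.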
Previously, when `SparseAutoModelForCausalLM.from_pretrained(...)` was called, the weights were loaded twice: once during `model = super(AutoModelForCausalLM, cls).from_pretrained(...)` and again after recipe application, which is undesirable. This PR updates the...
Bumps [idna](https://github.com/kjd/idna) from 2.10 to 3.7. Release notes Sourced from idna's releases. v3.7 What's Changed Fix issue where specially crafted inputs to encode() could take an exceptionally long amount of time...
## Short-term Work Issues or PRs that the NM team are planning to tackle for this quarter: * Dependencies * [x] Forkless Transformers https://github.com/neuralmagic/sparseml/pull/2199 * [x] Upgrade Transformers to latest...