sparseml
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Hi, I tried your example from the main page:

git clone https://github.com/neuralmagic/sparseml
pip install -e "sparseml[transformers]"
wget https://huggingface.co/neuralmagic/TinyLlama-1.1B-Chat-v0.4-pruned50-quant-ds/raw/main/recipe.yaml
sparseml.transformers.text_generation.oneshot --model_name TinyLlama/TinyLlama-1.1B-Chat-v1.0 --dataset_name open_platypus --recipe recipe.yaml --output_dir ./obcq_deployment --precision float16...
* Adds new vLLMQuantizationModifier that supports the new framework in compressed-tensors * Adds support for loading a model quantized in the compressed-tensors framework * Testing scripts for comparing performance to...
Bumps [pydantic](https://github.com/pydantic/pydantic) from 1.7.4 to 1.10.13. Release notes Sourced from pydantic's releases. V1.10.13 2023-09-27 What's Changed Update pip commands to install 1.10 by @chbndrhnns in pydantic/pydantic#6930 Make the v1 mypy...
# Summary - Update obcq tests into separate integration tests - Add/Update `test_sparsities --> test_obcq_sparsity.py`; now tests Llama-7b using gpu and "auto" using the same sparsity recipe as TinyStories -...
Applying character masks to prompts in the format `[foo]some text here\n[bar]response here`, masking the characters owned by `[bar]`
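The PR summary above does not show its implementation; a minimal sketch of the idea, assuming a hypothetical `char_mask` helper (the name, the regex for speaker tags, and the "segment runs until the next tag" rule are all illustrative assumptions, not sparseml's actual code):

```python
import re

def char_mask(prompt: str, masked_speaker: str = "[bar]") -> list[bool]:
    """Return one boolean per character: True where the character belongs
    to a segment owned by `masked_speaker`."""
    mask = [False] * len(prompt)
    # Find each speaker tag like [foo] or [bar]; a tag owns everything
    # from its opening bracket up to the next tag (or end of string).
    tags = list(re.finditer(r"\[[^\]]+\]", prompt))
    for i, tag in enumerate(tags):
        end = tags[i + 1].start() if i + 1 < len(tags) else len(prompt)
        if tag.group(0) == masked_speaker:
            for j in range(tag.start(), end):
                mask[j] = True
    return mask

prompt = "[foo]some text here\n[bar]response here"
mask = char_mask(prompt)
# Keep only the masked span to see what the mask covers.
masked_text = "".join(c for c, m in zip(prompt, mask) if m)
```

With the example prompt above, `masked_text` is `"[bar]response here"`: the `[foo]` segment stays unmasked, while everything from the `[bar]` tag onward is flagged. A character-level mask like this can then be mapped onto token spans when building loss masks.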
Previously, when `SparseAutoModelForCausalLM.from_pretrained(...)` was called, the weights were loaded twice: once during `model = super(AutoModelForCausalLM, cls).from_pretrained(...)` and again after recipe application, which is undesirable. This PR updates the...
Bumps [idna](https://github.com/kjd/idna) from 2.10 to 3.7. Release notes Sourced from idna's releases. v3.7 What's Changed Fix issue where specially crafted inputs to encode() could take an exceptionally long amount of time...
## Short-term Work Issues or PRs that the NM team are planning to tackle for this quarter: * Dependencies * [x] Forkless Transformers https://github.com/neuralmagic/sparseml/pull/2199 * [x] Upgrade Transformers to latest...