How to convert ProSparse-LLaMA-2-13B model to .gguf?
Prerequisites
Before submitting your question, please ensure the following:
- [x] I am running the latest version of PowerInfer. Development is rapid, and as of now, there are no tagged versions.
- [x] I have carefully read and followed the instructions in the README.md.
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
Question Details
Please provide a clear and concise description of your question. If applicable, include steps to reproduce the issue or behaviors you've observed.
Additional Context
Please provide any additional information that may be relevant to your question, such as specific system configurations, environment details, or any other context that could be helpful in addressing your inquiry.
```
python convert.py --outfile ./llama_convert/prosparse-llama-2-13b.powerinfer.gguf ./prosparse-llama-2-13b ./prosparse-llama-2-13b-predictor
Model architecture True is not supported by this convert.py. Trying with convert-hf-to-powerinfer-gguf.py...
Loading model: prosparse-llama-2-13b
Traceback (most recent call last):
  File "/root/autodl-tmp/powerinfer/PowerInfer/convert-hf-to-powerinfer-gguf.py", line 609, in
```
Hello. Have you solved this bug?
Try replacing the config.json file in your /path/to/model with the vLLM version. Detailed info: https://huggingface.co/SparseLLM/prosparse-llama-2-13b
_Here are the steps for adapting the original vLLM to ProSparse models. Replace the file vllm/model_executor/models/llama.py in the original vLLM with this file. Replace the contents of the original config.json with this file. Set the environment variable ACT_INFO. To test the version without activation threshold shifting, export ACT_INFO=relu. To test the version with activation threshold shifting, export ACT_INFO=fatrelu_0.01._
The vLLM version of config.json sets the model architecture to "LlamaForCausalLM", which is supported by convert-hf-to-powerinfer-gguf.py.
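If you just want the converter to recognize the model, a minimal sketch of patching the local config.json is below. This is an assumption-laden shortcut: it only changes the "architectures" field, whereas the vLLM config.json linked on the model card may differ in other fields, so replacing the whole file as described above is the safer route. The model path is hypothetical.

```python
# Minimal sketch (assumption: only the "architectures" field needs to change;
# the full vLLM config.json from the model card may contain other fields).
import json
from pathlib import Path

config_path = Path("./prosparse-llama-2-13b/config.json")  # hypothetical local model path

config = json.loads(config_path.read_text())
print("before:", config.get("architectures"))

# Point the converter at the architecture name it supports.
config["architectures"] = ["LlamaForCausalLM"]

config_path.write_text(json.dumps(config, indent=2))
print("after:", config["architectures"])
```

After patching (or replacing the file with the vLLM version), rerun the conversion command from above.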