LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
How to prune LLMs with Multi-Query Attention?
I ran LLM-Pruner with the command specified in the README to prune LLaMA-7B:

```bash
python hf_prune.py --pruning_ratio 0.25 \
    --block_wise \
    --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
    --block_attention_layer_start 4 --block_attention_layer_end...
```
Apologies if this has been asked before, but do you have pruned models that we can test and run locally? Anything on the Hugging Face Hub? I'd like to test some...
Is there a way to force the pruning to remove the same number of parameters from all layers? This would make the resulting model compatible with the HF implementation (loadable via from_pretrained).
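A quick way to see whether a given run produced uniform widths (the property from_pretrained needs) is to inspect the saved model's per-layer shapes. A minimal sketch, assuming a LLaMA-style checkpoint saved whole with `torch.save` as in this repo's examples; the path and the saved-dict layout are assumptions:

```python
import torch

# Hypothetical path; LLM-Pruner's example scripts save the pruned model object
# with torch.save rather than in standard HF format.
ckpt = torch.load("prune_log/llama_prune/pytorch_model.bin", map_location="cpu")
model = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt

# Collect the shape of every MLP gate projection; a single distinct shape
# means every layer was pruned by the same amount.
shapes = {tuple(p.shape) for n, p in model.named_parameters()
          if n.endswith("mlp.gate_proj.weight")}
print("distinct gate_proj shapes:", shapes)
```

If more than one shape comes back, the layers have different widths, so no single `intermediate_size` in the config can describe the model, which is exactly why the stock HF loader rejects it.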
I simply run the following command:

```bash
python hf_prune.py --pruning_ratio 0.62785 --block_wise \
    --block_mlp_layer_start 0 --block_mlp_layer_end 32 \
    --block_attention_layer_start 32 --block_attention_layer_end 32 \
    --pruner_type taylor --base_model /mnt/petrelfs/xxx/llama2-7b \
    --device cpu --eval_device cuda --taylor...
```
```
Traceback (most recent call last):
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/examples/baichuan.py", line 342, in <module>
    main(args)
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/examples/baichuan.py", line 229, in main
    pruner.step()
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/LLMPruner/torch_pruning/pruner/algorithms/metapruner.py", line 186, in step
    for group in self.prune_local():
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/LLMPruner/torch_pruning/pruner/algorithms/metapruner.py", ...
```
evaluate
Hello, after pruning, I fine-tuned the model on the [alpaca_data_zh_51k](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/data/alpaca_data_zh_51k.json) dataset. How can I evaluate the performance of the fine-tuned model on alpaca_data_zh_51k? Thanks.
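One dataset-specific option, independent of this repo's own evaluation scripts, is to hold out part of alpaca_data_zh_51k and measure perplexity on it. A minimal sketch, assuming the fine-tuned model has been exported in Hugging Face format; the checkpoint path and the 500-sample held-out split are assumptions:

```python
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/pruned_finetuned_model"  # hypothetical checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16
).cuda().eval()

with open("alpaca_data_zh_51k.json") as f:
    data = json.load(f)
held_out = data[-500:]  # assumption: the tail is reserved as an eval split

nll_sum, n_tokens = 0.0, 0
with torch.no_grad():
    for sample in held_out:
        # Alpaca-style records have instruction / input / output fields.
        text = sample["instruction"] + sample.get("input", "") + sample["output"]
        ids = tokenizer(text, return_tensors="pt",
                        truncation=True, max_length=1024).input_ids.cuda()
        loss = model(ids, labels=ids).loss  # mean per-token NLL
        nll_sum += loss.item() * ids.numel()
        n_tokens += ids.numel()

print(f"held-out perplexity: {torch.exp(torch.tensor(nll_sum / n_tokens)).item():.2f}")
```

Strictly speaking, the held-out split must be excluded from fine-tuning, otherwise the perplexity will be optimistic; for instruction-following quality beyond perplexity, a standard harness such as lm-evaluation-harness on public benchmarks is the usual complement.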