
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.

54 LLM-Pruner issues

How to prune LLMs with Multi-Query Attention?

I run LLM-Pruner with the command specified in the README to prune LLaMA-7B:

```bash
python hf_prune.py --pruning_ratio 0.25 \
    --block_wise \
    --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
    --block_attention_layer_start 4 --block_attention_layer_end...
```

Apologies if this has been asked before, but do you have pruned models that we can test and run locally? Anything on the Hugging Face Hub? I'd like to test some...

Is there a way to force the pruning to remove the same number of parameters from every layer? This would make the resulting model compatible with the HF implementation (loadable via from_pretrained).
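For context, block-wise pruning can leave each layer with a different number of attention heads and a different MLP width, so a single config.json can no longer describe the network and `from_pretrained` fails; the repo instead saves the whole pruned module. Below is a minimal loading sketch under that assumption; the checkpoint path and the `'model'`/`'tokenizer'` keys are assumptions based on how the pruning script appears to save its output, not a guaranteed API.

```python
import torch

# Hypothetical path to a checkpoint produced by hf_prune.py; adjust to your run.
ckpt_path = "prune_log/llama_prune/pytorch_model.bin"

# The pruned model is stored as a full pickled nn.Module rather than a state_dict,
# because per-layer shapes no longer match the original config.
ckpt = torch.load(ckpt_path, map_location="cpu")
model = ckpt["model"]          # assumed key: the pruned causal LM module
tokenizer = ckpt["tokenizer"]  # assumed key: the matching tokenizer

model.eval()
inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Forcing the same number of heads/channels to be removed in every layer, as asked above, would keep one uniform config and restore `from_pretrained` compatibility, at the cost of a less flexible importance-based allocation across layers.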

I simply use the following command to run:

```bash
python hf_prune.py --pruning_ratio 0.62785 --block_wise \
    --block_mlp_layer_start 0 --block_mlp_layer_end 32 \
    --block_attention_layer_start 32 --block_attention_layer_end 32 \
    --pruner_type taylor --base_model /mnt/petrelfs/xxx/llama2-7b \
    --device cpu --eval_device cuda --taylor...
```

```
Traceback (most recent call last):
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/examples/baichuan.py", line 342, in <module>
    main(args)
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/examples/baichuan.py", line 229, in main
    pruner.step()
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/LLMPruner/torch_pruning/pruner/algorithms/metapruner.py", line 186, in step
    for group in self.prune_local():
  File "/home/jovyan/honor/yangdong/LLM-Pruner-main/LLMPruner/torch_pruning/pruner/algorithms/metapruner.py",...
```

bug

Hello. After pruning, I fine-tune the model on the [alpaca_data_zh_51k](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/data/alpaca_data_zh_51k.json) dataset. How should I evaluate the performance of the fine-tuned model on alpaca_data_zh_51k? Thanks.
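One straightforward option is to hold out part of alpaca_data_zh_51k from fine-tuning and report the fine-tuned model's perplexity on that split, in the same spirit as the repo's perplexity evaluation. A rough sketch follows, assuming the standard Alpaca JSON schema (`instruction`/`input`/`output`) and that `model` and `tokenizer` are already loaded the way you load them for inference; the prompt template and the 1,000-sample split are illustrative choices, not part of the repo.

```python
import json
import math
import torch

def alpaca_prompt(ex):
    # Standard Alpaca-style template; adjust if your fine-tuning used a different one.
    if ex.get("input"):
        return (f"### Instruction:\n{ex['instruction']}\n\n"
                f"### Input:\n{ex['input']}\n\n### Response:\n{ex['output']}")
    return f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}"

@torch.no_grad()
def eval_perplexity(model, tokenizer, examples, device="cuda", max_len=512):
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for ex in examples:
        enc = tokenizer(alpaca_prompt(ex), return_tensors="pt",
                        truncation=True, max_length=max_len).to(device)
        out = model(**enc, labels=enc["input_ids"])
        n = enc["input_ids"].numel()
        total_nll += out.loss.item() * n   # loss is the mean per-token NLL (approx.)
        total_tokens += n
    return math.exp(total_nll / total_tokens)

# `model` / `tokenizer`: your fine-tuned pruned model, loaded however you run inference.
data = json.load(open("alpaca_data_zh_51k.json", encoding="utf-8"))
held_out = data[-1000:]   # illustrative held-out split; exclude it from fine-tuning
print("perplexity:", eval_perplexity(model, tokenizer, held_out))
```

For instruction-following quality beyond perplexity, the usual complement is to generate responses on the held-out split and score them separately (e.g., by human review or an automatic judge).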