LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
Hi, does it support Qwen2?
If so, how can we convert a Qwen2 model? Waiting for your reply, thanks.
How can I fix this error in custom model pruning? The pruning completed, but some layers weren't pruned properly, and I got this error at inference time: ` key_states...
Hi! When loading the pruned model (produced by LLM-Pruner), how do we load it in a way equivalent to `AutoModelForCausalLM` in Hugging Face Transformers?
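One reason `AutoModelForCausalLM.from_pretrained` cannot be used directly is that structural pruning changes layer shapes so they no longer match the original Hugging Face config. A common workaround is to serialize the entire model object and reload it with `torch.load`. The sketch below illustrates that pattern with a tiny stand-in module; the `TinyPruned` class, checkpoint key `"model"`, and file name are illustrative assumptions, not LLM-Pruner's actual API.

```python
# Sketch of the full-object save/load pattern often used after structural
# pruning: the whole module (architecture + weights) is pickled, so the
# non-standard pruned shapes survive the round trip. TinyPruned is a
# hypothetical stand-in for a pruned transformer, not LLM-Pruner code.
import os
import tempfile

import torch
import torch.nn as nn


class TinyPruned(nn.Module):
    # Stands in for a model whose hidden size was shrunk by pruning.
    def __init__(self, hidden=6):
        super().__init__()
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, x):
        return self.proj(x)


model = TinyPruned()
path = os.path.join(tempfile.mkdtemp(), "pruned_model.bin")

# Save the entire module, not just state_dict(), because the pruned
# shapes would not fit a model rebuilt from the original config.
torch.save({"model": model}, path)

# weights_only=False is required on recent PyTorch versions to unpickle
# a full nn.Module (only do this for checkpoints you trust).
ckpt = torch.load(path, weights_only=False)
loaded = ckpt["model"]
out = loaded(torch.zeros(1, 6))
print(out.shape)  # torch.Size([1, 6])
```

Note that unpickling a full module requires the defining class to be importable at load time, which is why LLM-Pruner-style checkpoints are usually loaded from within the pruner's own codebase rather than via the `Auto*` classes.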