LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
When I start recovery training for Baichuan-7B, I hit this bug: Exception has occurred: RuntimeError Caught RuntimeError in replica 1 on device 1. Original Traceback (most recent call last): File "/opt/miniconda3/envs/flash/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py",...
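For context, a "Caught RuntimeError in replica 1 on device 1" error from `parallel_apply.py` often means the model (or its inputs) was not fully on the primary device before `nn.DataParallel` replicated it. The sketch below is not the repo's code; the layer, batch size, and two-GPU setup are illustrative assumptions.

```python
# Minimal sketch of the usual nn.DataParallel invariant (assumes 2 GPUs):
# every parameter/buffer must live on the primary device before wrapping,
# and inputs must also start on that device.
import torch
import torch.nn as nn

model = nn.Linear(4096, 4096)              # stand-in for the pruned Baichuan-7B model
model = model.to("cuda:0")                 # move ALL weights to the primary device first
model = nn.DataParallel(model, device_ids=[0, 1])

x = torch.randn(8, 4096, device="cuda:0")  # batch is scattered across GPUs by DataParallel
y = model(x)
```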
Thanks a lot for your work on LLM compression, and I am looking forward to the code for ChatGLM. When will it be available for GLMs?
There are no random seed settings in post_training.py. Were the results in the paper produced with a fixed random seed? I look forward to your reply. Thank you very much!
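For reference, pinning the run in a PyTorch training script usually looks like the sketch below. The helper name and the seed value are illustrative, not taken from post_training.py, and the paper does not state which seed (if any) was used.

```python
# Minimal seeding sketch, assuming the script uses random, NumPy, and PyTorch.
import random
import numpy as np
import torch

def set_seed(seed: int = 42):
    random.seed(seed)            # Python's built-in RNG
    np.random.seed(seed)         # NumPy RNG
    torch.manual_seed(seed)      # CPU RNG
    torch.cuda.manual_seed_all(seed)  # all CUDA device RNGs
```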
Thank you for your solid work. Does the current version support GQA-architecture models such as LLaMA-2-70B and LLaMA-3?
Added Model on CUDA.
Does the current version support Qwen?
Hi, I ran the code successfully but found that pytorch_model.bin does not exist in the tune_log/llama_0.2/checkpoint-200 folder. Could you suggest possible reasons, or have you encountered this...
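One possible explanation, assuming the recovery stage fine-tunes with LoRA via PEFT: PEFT checkpoints save only the adapter weights (adapter_model.bin or adapter_model.safetensors), not a full pytorch_model.bin. The sketch below shows how such a checkpoint is typically loaded; the base-model path is a placeholder, and whether this matches the repo's saving logic is an assumption.

```python
# Hedged sketch: load a LoRA adapter checkpoint on top of the pruned base model.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("path/to/pruned_llama")  # hypothetical path
model = PeftModel.from_pretrained(base, "tune_log/llama_0.2/checkpoint-200")
model = model.merge_and_unload()  # optionally fold the adapter back into the base weights
```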