Ma Xinyin
Hi. We have updated the code in https://github.com/horseee/LLM-Pruner. Please refer to the new repo.
Our code currently does not support `iterative_steps > 1` for Baichuan. Please try `iterative_steps = 1`.
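A hypothetical invocation of the single-step setting, for illustration only — the script name and flag names here are assumed from the repo's examples and may differ in your checkout; the one point being made is `--iterative_steps 1`:

```shell
# Illustrative only: entry point and flags assumed from the repo's examples.
# The essential setting for Baichuan is iterative_steps = 1 (prune in one shot).
python hf_prune.py \
    --base_model baichuan-inc/Baichuan-7B \
    --pruning_ratio 0.25 \
    --iterative_steps 1 \
    --save_model
```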
Hi. LLM-Pruner is a general structural pruning method for LLMs, and it can also be used to prune BLOOM. However, due to the increasing number of LLMs in...
Hi. I uploaded the code for pruning BLOOM. You can find the instructions for pruning BLOOM [here](https://github.com/horseee/LLM-Pruner/tree/main/examples#cherry_blossom-bloom). I only conducted a quick test on BLOOM-3B to make sure it...
Hi. From the perspective of the algorithm, it is entirely feasible to use this on Bloom 176B. However, the current algorithm requires gradient computation, and recording these gradients for a...
Hi. Please refer to this error:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
```
Please check whether...
Hi. You need to configure wandb first. You can follow the wandb setup instructions 😄
Hi. Could you please check whether you deleted the gradients used for calculating the Taylor importance before saving the model?
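A minimal pure-Python sketch of what "deleting the gradients before saving" means here — in PyTorch you would set `p.grad = None` on each parameter before calling `torch.save`; the class and function names below are illustrative, not the repo's API:

```python
# Illustrative stand-in for a framework parameter: holds weights plus an
# optional gradient buffer (filled in by a backward pass / Taylor importance).
class Param:
    def __init__(self, data):
        self.data = data
        self.grad = None

def strip_gradients(params):
    """Clear gradient buffers so a saved checkpoint holds only the weights
    (PyTorch analogue: `for p in model.parameters(): p.grad = None`)."""
    for p in params:
        p.grad = None
    return params

params = [Param([1.0, 2.0]), Param([3.0])]
params[0].grad = [0.1, 0.2]      # pretend a backward pass populated this
strip_gradients(params)
print(all(p.grad is None for p in params))  # True: weights kept, gradients gone
```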
Hi. The pruning needs around 80G of memory if you use the Taylor pruner, since it needs to compute the gradients of the model. If you use other pruners, like L2...
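As a rough back-of-the-envelope check (my own illustrative arithmetic, not the repo's exact accounting): a 7B-parameter model in fp32 takes 4 bytes per weight plus another 4 bytes per gradient, already about 52GB before activations and any other overhead, which is why the requirement lands near 80G:

```python
# Rough memory estimate for gradient-based (Taylor) pruning.
# Illustrative arithmetic only -- activations and overhead are not counted.
def taylor_pruning_memory_gb(n_params, bytes_per_param=4):
    """Weights + gradients, both stored in fp32 (4 bytes each)."""
    weights = n_params * bytes_per_param
    grads = n_params * bytes_per_param  # Taylor importance needs gradients
    return (weights + grads) / 1024**3

print(round(taylor_pruning_memory_gb(7e9)))  # ~52 GB before activations/overhead
```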
Hi. Could you tell me which LLaMA-7B checkpoint you used? `decapoda-research/llama-7b-hf` in my code is not available currently, and I'm not sure whether that is the cause of this difference.