Roberto
I have the same doubt.
I have the same problem. I think we should try not pruning the first 3-5 layers and the last 3-5 layers. I'm trying...
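If it helps, here is a minimal sketch of how that skip could look with Torch-Pruning's `ignored_layers` option. The model name, the skip count of 4, and the pruning ratio are assumptions for illustration, and the full LLM-Pruner pipeline does more than this:

```python
import torch
import torch_pruning as tp
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")
model.config.use_cache = False  # avoids tracing issues, see the suggestion below

# Keep the first and last few transformer blocks out of the pruning graph.
num_layers = model.config.num_hidden_layers
skip = 4  # assumed value in the 3-5 range discussed above
ignored_layers = [
    layer for i, layer in enumerate(model.model.layers)
    if i < skip or i >= num_layers - skip
]

example_inputs = torch.randint(0, model.config.vocab_size, (1, 64))
pruner = tp.pruner.MetaPruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),
    pruning_ratio=0.25,  # assumed global ratio
    ignored_layers=ignored_layers,
)
pruner.step()
```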
> [@Cyber-Vadok](https://github.com/Cyber-Vadok) I think it will have a size mismatch problem when loading the model after pruning? But we can try!

You are right! In LLM-Pruner there's the ["root_instances" argument](https://github.com/horseee/LLM-Pruner/blob/128a07d977f9b205d60ab14cfbc6a78f8a8e39d2/llama3.py#L114C1-L115C137) and it...
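On the size mismatch when reloading: one common workaround (a sketch, not LLM-Pruner's own save path) is to save the whole pruned model object instead of just the `state_dict`, so the changed shapes are restored with it:

```python
import torch

# After pruning, the layer shapes no longer match the original config,
# so load_state_dict() on a freshly built model will fail.
model.zero_grad()  # clear grads so they aren't serialized
torch.save(model, "pruned_model.pt")  # pickles the full module, shapes included

# Later, load the object directly instead of rebuilding from the config:
model = torch.load("pruned_model.pt", weights_only=False)
```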
I have the same problem with [Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)
As suggested [here](https://github.com/horseee/LLM-Pruner/issues/93), add `model.config.use_cache = False`.
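For anyone searching later, a minimal sketch of where that flag goes (model name taken from the comment above; my understanding is that it stops the traced forward pass from returning cache tensors the dependency graph can't handle):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")

# Disable the KV cache before running the pruner.
model.config.use_cache = False
```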
> After adding `model.config.use_cache = False`, how long does it take to prune the model? Since the cache is used to speed up the process, would it cost so many...
I think the problem is that after pruning, the shapes in your model have changed. My guess comes from [Modify static attributes or forward functions](https://github.com/VainF/Torch-Pruning/tree/master?tab=readme-ov-file#modify-static-attributes-or-forward-functions) in the README.
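A sketch of what that README section means in practice for a LLaMA/Qwen-style attention block. The attribute names here follow older `transformers` releases (the ones LLM-Pruner targets); newer versions read some of these from `config` instead, so treat this as an assumption to adapt:

```python
# num_heads etc. are plain Python ints; Torch-Pruning only rewires
# the weight tensors, so these static attributes must be fixed by hand.
for layer in model.model.layers:
    attn = layer.self_attn
    attn.num_heads = attn.q_proj.out_features // attn.head_dim
    attn.num_key_value_heads = attn.k_proj.out_features // attn.head_dim
    attn.hidden_size = attn.q_proj.out_features
```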