LLM-Pruner
The new pytorch.bin is bigger than the original model
When I chose to save the model, I noticed something strange: the new pytorch.bin is bigger than the original model. I tested Baichuan-7B with --pruning_ratio 0.5 and added --save_model to save the model after pruning, but the new pytorch.bin is 17 GB while the original model is only 13 GB. Could you please tell me why? Thank you!
Hi @lb553024300, don't worry about the bin size. We store the whole model object to disk with `torch.save(model)`, so the saved file will be larger than one produced by `torch.save(model.state_dict())`.
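A minimal sketch of the difference, with illustrative file names and assuming a pruned `model` is already in memory:

```python
import torch

# torch.save(model) pickles the entire nn.Module object, including any
# attached buffers and .grad tensors, so the file can be much larger
# than the parameters alone.
torch.save(model, 'pruned_model_full.bin')

# torch.save(model.state_dict()) stores only the parameter/buffer tensors.
torch.save(model.state_dict(), 'pruned_model_state.bin')
```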
Got it, thanks!
Hi. Could you please check whether you deleted the gradients used for the Taylor importance calculation before saving the model?
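For reference, a hedged sketch of what clearing the gradients before saving might look like (not necessarily the exact LLM-Pruner code path):

```python
# Drop all accumulated gradients so they are not pickled with the model.
model.zero_grad(set_to_none=True)
for p in model.parameters():
    p.grad = None  # belt-and-braces; set_to_none=True should already do this
torch.save(model, 'pruned_model.bin')
```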
Hi @VainF, can you tell me how to convert it back to the same format as the original model? I need the smaller file size for storage. Thank you very much.
I tried `torch.save(model.state_dict())` but the file size is still the same. Is there any way to save it in the same format as the original model on Hugging Face? I tried loading the model and saving it again, but it still didn't reduce the size.
```python
import torch
import argparse


def main(args):
    # Load the pruned checkpoint, which stores the whole model object
    # and the tokenizer in a single dict.
    pruned_dict = torch.load(args.ckpt, map_location='cpu')
    tokenizer, model = pruned_dict['tokenizer'], pruned_dict['model']
    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")

    # Remove gradients left over from the importance calculation
    model.zero_grad()
    for name, param in model.named_parameters():
        if 'weight' in name:
            param.grad = None
    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")  # => ~25 GB

    # model.half()
    # print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")  # => ~12 GB

    # Save in the Hugging Face format
    model.save_pretrained(args.output_dir)  # => ~25 GB
    tokenizer.save_pretrained(args.output_dir)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--ckpt", type=str, required=True)
    parser.add_argument("--output_dir", type=str, required=True)
    args = parser.parse_args()
    main(args)
```
I am working with the Bloom model.
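One possibility worth checking (an assumption, not confirmed in this thread): if the original Hugging Face checkpoint is stored in fp16 while the pruned model is kept in fp32, the on-disk size roughly doubles. Casting before saving, as the commented-out lines in the script above hint, should roughly halve the file:

```python
model.half()  # fp32 -> fp16; roughly halves the saved size
model.save_pretrained(args.output_dir)
tokenizer.save_pretrained(args.output_dir)
```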