
The new pytorch.bin is bigger than the original model

Open lb553024300 opened this issue 1 year ago • 4 comments

When I chose to save the model, I noticed something strange: the new pytorch.bin is bigger than the original model. I tested Baichuan-7B with --pruning_ratio 0.5 and added --save_model to save the model after pruning, but the new pytorch.bin is 17GB, while the original model is only 13GB. Could you please tell me why? Thank you!

lb553024300 avatar Nov 18 '23 16:11 lb553024300

Hi @lb553024300, don't worry about the bin size. We store the whole model object to disk with torch.save(model), so the saved file will be larger than one produced by torch.save(model.state_dict()).
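For illustration, a minimal sketch of the difference (a toy nn.Linear stands in for the pruned LLM; file names are placeholders):

import torch
import torch.nn as nn

# Toy module standing in for the pruned LLM (illustration only).
model = nn.Linear(1024, 1024)

# Saving the whole module object pickles the full Python object graph,
# not just the weight tensors.
torch.save(model, "whole_model.bin")

# Saving only the state_dict stores just the parameter tensors.
torch.save(model.state_dict(), "state_dict.bin")

# Loading differs accordingly: the first file gives back a ready-to-use module,
# the second has to be loaded into an existing module of the same shape.
loaded_model = torch.load("whole_model.bin", map_location="cpu")
new_model = nn.Linear(1024, 1024)
new_model.load_state_dict(torch.load("state_dict.bin", map_location="cpu"))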

VainF avatar Nov 18 '23 17:11 VainF

Hi @lb553024300, don't worry about the bin size. We store the whole model object to disk with torch.save(model), so the saved file will be larger than one produced by torch.save(model.state_dict()).

Got it, thanks.

lb553024300 avatar Nov 19 '23 02:11 lb553024300

Hi. Could you please check whether you deleted the gradients used for calculating the Taylor importance before saving the model?
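A minimal sketch of what that would look like, assuming `model` and `tokenizer` are the in-memory objects from the pruning script (`save_pruned` is a hypothetical helper; the dict layout matches the checkpoint loaded below):

import torch

def save_pruned(model, tokenizer, path):
    # Drop the gradients accumulated for the Taylor importance estimate
    # before saving.
    model.zero_grad(set_to_none=True)
    for param in model.parameters():
        param.grad = None  # be explicit in case anything still holds a grad
    # Same {'model', 'tokenizer'} dict layout as the saved checkpoint.
    torch.save({'model': model, 'tokenizer': tokenizer}, path)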

horseee avatar Nov 19 '23 04:11 horseee

Hi @VainF, can you tell me how to convert it back to the same format as the original model? A smaller file size is what I need for storage. Thank you very much.

Hi @lb553024300, don't worry about the bin size. We store the whole model object to disk with torch.save(model), so the saved file will be larger than one produced by torch.save(model.state_dict()).

I tried torch.save(model.state_dict()), but the file size is still the same. Is there any way to save it in the same format as the original model on Hugging Face? I tried loading the model and saving it, but it still didn't reduce the size.

import torch
import argparse

def main(args):
    # Load the checkpoint saved by LLM-Pruner: a dict holding the whole
    # model object and the tokenizer.
    pruned_dict = torch.load(args.ckpt, map_location='cpu')
    tokenizer, model = pruned_dict['tokenizer'], pruned_dict['model']

    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")

    # Remove gradients left over from pruning
    model.zero_grad()
    for name, param in model.named_parameters():
        if 'weight' in name:
            param.grad = None

    print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")  # => ~25GB

    # model.half()
    # print(f"Model took {round(model.get_memory_footprint() / 1e9, 2)} GB")  # => ~12GB

    # Save in the Hugging Face format
    model.save_pretrained(args.output_dir)  # => ~25GB
    tokenizer.save_pretrained(args.output_dir)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--ckpt", type=str, required=True)
    parser.add_argument("--output_dir", type=str, required=True)
    args = parser.parse_args()
    main(args)

I am working with the Bloom model.
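A variant of the script above with the half() lines enabled is roughly what I would try next. This is only a sketch, assuming the extra size comes from the pruned model sitting in float32 while the original checkpoint is fp16; the paths are placeholders:

import torch

pruned_dict = torch.load("pruned_model.bin", map_location="cpu")  # placeholder path
tokenizer, model = pruned_dict['tokenizer'], pruned_dict['model']

model.zero_grad(set_to_none=True)  # drop gradients left over from pruning
model.half()                       # cast back to fp16, roughly halving the footprint

model.save_pretrained("pruned_hf")  # standard Hugging Face directory layout
tokenizer.save_pretrained("pruned_hf")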

trinhdoduyhungss avatar Dec 04 '23 15:12 trinhdoduyhungss