Pruned model is same size as original
Great work on the project, really excited to see the outcomes.
However, after running the script below, the pruned model saved to `out/pruned` is the same size as the original one (6.38 GB):
```
!python /content/wanda/main.py \
    --model openlm-research/open_llama_3b_v2 \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save_model out/pruned \
    --save out/open_llama_3b_v2/unstructured/wanda/
```
Is this expected, or am I missing something?
Yes, this is expected for unstructured pruning. To my understanding, unstructured sparsity only zeroes out individual weights; the tensors are still stored densely, so it won't reduce the memory footprint on modern GPU devices.
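A minimal sketch of why that happens, using NumPy and plain magnitude pruning as a stand-in for wanda's activation-aware score (the shapes and the pruning criterion here are illustrative, not taken from the wanda code):

```python
import numpy as np

# Hypothetical dense layer weight (shape is illustrative, not from wanda).
rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)

# Unstructured 50% sparsity: zero out the half of the weights with the
# smallest magnitudes.
k = w.size // 2
threshold = np.partition(np.abs(w).ravel(), k)[k]
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0)

# Roughly half the entries are now exactly zero...
sparsity = float(np.mean(w_pruned == 0.0))
print(f"sparsity: {sparsity:.2f}")

# ...but the array is still stored densely: same dtype, same element count,
# so the same number of bytes on disk and in memory.
print(f"original bytes: {w.nbytes}, pruned bytes: {w_pruned.nbytes}")
```

The zeros are saved explicitly alongside the surviving weights, which is why the checkpoint size doesn't change; shrinking the file would require a sparse or compressed storage format rather than just masking values.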
Thanks! Are there any pruning options that do reduce the memory footprint?