run 70b error: RuntimeError: shape '[1, 4096, 64, 128]' is invalid for input of size 4194304
After setting up the environment as instructed, I successfully pruned llama2-7b using Wanda without any issues. However, when attempting to prune llama2-70b, the following error occurred:
Traceback (most recent call last):
  File "/home/jovyan/projects/BYD/wanda-main/main.py", line 110, in <module>

The command used was:

python main.py \
    --model ../weights/llama-2-70b-hf \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save ../weights/wanda/ \
    --save_model ../weights/wanda_70b/
Could you please help me understand why this error occurs? Do I need to upgrade the environment, in particular the transformers library? Your assistance is appreciated.
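For what it's worth, the numbers in the error are consistent with grouped-query attention: llama-2-70b stores keys/values with far fewer KV heads than its 64 attention heads, which older transformers versions did not handle when reshaping. A minimal arithmetic sketch (the 8-KV-head count is an assumption taken from the published llama-2-70b configuration, not from this traceback):

```python
# Numbers taken from the error message; head counts are assumptions
# based on the published llama-2-70b configuration.
seq_len, head_dim = 4096, 128
n_query_heads = 64   # attention heads the reshape target assumes
n_kv_heads = 8       # grouped-query KV heads actually present (assumed)

expected = 1 * seq_len * n_query_heads * head_dim  # what view([1, 4096, 64, 128]) needs
actual = 1 * seq_len * n_kv_heads * head_dim       # size of the real key/value tensor

print(expected)  # 33554432 -- the reshape target
print(actual)    # 4194304  -- "input of size 4194304" from the error
```

If that reading is right, the key/value tensor is 8x smaller than the reshape expects, which would explain why upgrading transformers (which added grouped-query attention support for llama-2-70b) makes the error go away.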
After upgrading the transformers library, the previous error disappeared, but a new error has surfaced:

RuntimeError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 39.39 GiB total capacity; 34.73 GiB already allocated; 2.60 GiB free; 34.76 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
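As a starting point, the allocator hint printed in the message itself can be tried, though note that llama-2-70b in fp16 needs roughly 140 GB for the weights alone, so a single 40 GB GPU cannot hold the full model without multi-GPU sharding or CPU offloading. A minimal sketch of the suggested allocator setting (the 128 MB split size is illustrative, not a verified fix for this workload):

```shell
# Allocator hint suggested by the OOM message itself; 128 is an
# illustrative split size, not a verified fix for this workload.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

This only mitigates fragmentation (the "reserved memory is >> allocated memory" case); if the model genuinely does not fit on one GPU, the run will still OOM regardless of this setting.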
Same error even after upgrading the transformers library.