pyllama
"torch.cuda.OutOfMemoryError: CUDA out of memory" when I'm *not* out of memory
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 12.00 GiB total capacity; 2.60 GiB already allocated; 8.36 GiB free; 2.62 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Whoa, I'm out of memory already? 8.36 GiB free and I can't allocate 64.00 MiB?
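The error text itself hints at the likely cause: fragmentation of PyTorch's caching allocator, where free memory exists but no single block is large enough. One commonly suggested mitigation is the `max_split_size_mb` option of `PYTORCH_CUDA_ALLOC_CONF`, which the error message also points to. A minimal sketch (the value `128` is just an example, not a recommendation from this thread; the variable must be set before torch initializes CUDA):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF is read when the CUDA allocator initializes,
# so set it before importing torch (or export it in the shell instead).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import only after the variable is set
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, from the shell before launching the script: `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` on Windows, or `export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` on Linux. `torch.cuda.memory_summary()` can also help confirm whether reserved-but-unallocated memory is the problem.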
What is the command you ran?
python -m llama.llama_quant .\llama-7b-hf --wbits 8 --save .\llama-7b-hf\pyllama-7B8b.pt c4
I got the same thing.