ChatGLM-6B torch.cuda.OutOfMemoryError

Is your feature request related to a problem? Please describe.

1

Solutions

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 10.90 GiB total capacity; 10.62 GiB already allocated; 16.81 MiB free; 10.62 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I hit this error when running python web_demo.py. What should I do? My GPU should have enough memory. How should I change the configuration?

Additional context

1

May 26 '23 02:05 gaofeng36599

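I ran into this yesterday too. It simply means the GPU does not have enough memory. Run nvidia-smi and check whether other programs are using the GPU memory.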

May 26 '23 02:05 runzhi214
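A minimal Python sketch of the nvidia-smi check suggested above, for cases where scripting it is more convenient than reading the table by eye. The helper names `parse_memory_csv` and `query_gpu_memory` are hypothetical; `--query-gpu` and `--format` are standard nvidia-smi options. The query returns `None` when nvidia-smi is not on the PATH.

```python
import shutil
import subprocess


def parse_memory_csv(output):
    """Parse lines such as '111, 11264' into (used_mib, total_mib) tuples."""
    rows = []
    for line in output.strip().splitlines():
        used, total = (int(part.strip()) for part in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Return one (used_mib, total_mib) tuple per GPU, or None without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_csv(result.stdout)
```

Calling `query_gpu_memory()` before starting the demo shows how much memory is already taken per GPU. A large `used_mib` value at that point suggests another process is holding memory.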

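No. When nothing is running, memory usage is only 111MiB / 11264MiB. The readme.md says the INT4-quantized model needs only about 5.2 GB, so why does my 10 GB card report that memory is insufficient?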

May 26 '23 02:05 gaofeng36599
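One possible explanation, sketched as arithmetic: the 5.2 GB figure applies only when the model is actually loaded in INT4. If the demo script loads FP16 weights instead (an assumption, since the thread does not show the loading line), the weights alone would not fit in 10.90 GiB. The estimate below uses the 6.2 billion parameter count from the project README and counts weights only. Activations, the CUDA context, and any layers left unquantized come on top, so real usage is higher than these numbers.

```python
GIB = 1024 ** 3
CHATGLM_6B_PARAMS = 6.2e9  # parameter count as stated in the README


def weight_gib(num_params, bits_per_param):
    """Rough size of the weights alone, in GiB, at the given precision."""
    return num_params * bits_per_param / 8 / GIB


for label, bits in (("FP16", 16), ("INT8", 8), ("INT4", 4)):
    print(f"{label}: about {weight_gib(CHATGLM_6B_PARAMS, bits):.1f} GiB of weights")
```

At FP16 this comes to roughly 11.5 GiB, above the 10.90 GiB total capacity in the error message, while INT4 comes to roughly 2.9 GiB. If this is the cause, the README's section on quantized loading is the place to look.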

With 16 GB of GPU memory I also run out after several rounds of conversation:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 646.00 MiB (GPU 0; 14.76 GiB total capacity; 12.35 GiB already allocated; 529.75 MiB free; 13.41 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Jun 01 '23 08:06 sujunze
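The two error messages in this thread differ in a way that matters for the fragmentation hint they both print. The sketch below parses such a message and computes reserved minus allocated, which is the memory PyTorch holds in its cache but is not currently using. The function names are hypothetical, and the regular expression only targets the message layout quoted here, so other PyTorch versions may need adjustments.

```python
import re

_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

_OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<request>[\d.]+ [MG]iB).*?"
    r"(?P<total>[\d.]+ [MG]iB) total capacity; "
    r"(?P<allocated>[\d.]+ [MG]iB) already allocated; "
    r"(?P<free>[\d.]+ [MG]iB) free; "
    r"(?P<reserved>[\d.]+ [MG]iB) reserved"
)


def to_mib(text):
    """Convert a string such as '10.62 GiB' to a float number of MiB."""
    value, unit = text.split()
    return float(value) * _TO_MIB[unit]


def parse_oom(message):
    """Extract the memory figures from a CUDA OOM message, all in MiB."""
    match = _OOM_PATTERN.search(message)
    if match is None:
        raise ValueError("unrecognized OOM message format")
    return {name: to_mib(raw) for name, raw in match.groupdict().items()}


def cached_but_unused_mib(message):
    """Reserved minus allocated: memory cached by PyTorch but not in use."""
    fields = parse_oom(message)
    return fields["reserved"] - fields["allocated"]


FIRST = ("Tried to allocate 128.00 MiB (GPU 0; 10.90 GiB total capacity; "
         "10.62 GiB already allocated; 16.81 MiB free; "
         "10.62 GiB reserved in total by PyTorch)")
SECOND = ("Tried to allocate 646.00 MiB (GPU 0; 14.76 GiB total capacity; "
          "12.35 GiB already allocated; 529.75 MiB free; "
          "13.41 GiB reserved in total by PyTorch)")

print(cached_but_unused_mib(FIRST))
print(cached_but_unused_mib(SECOND))
```

In the first message the gap is zero, so the card is simply full and allocator tuning is unlikely to help. In the second, about 1085 MiB is cached but unused while a 646 MiB request fails, which is the pattern the max_split_size_mb hint is aimed at.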

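The same code used to train fine for me. Recently I hit this problem with full-parameter fine-tuning too, which is very strange. Training does work when the model is quantized to 4 or 8 bits. But how can this be solved? Can any expert help?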

Jun 13 '23 02:06 hejianls
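A rough sketch of why full-parameter fine-tuning is so much heavier than the quantized runs. It assumes an Adam-style optimizer and the common rule of thumb of about 16 bytes of training state per parameter (weights, gradients, and optimizer states), with no sharding or offloading. The thread does not state the optimizer or hardware, so treat the figure as an order-of-magnitude estimate only.

```python
GIB = 1024 ** 3
CHATGLM_6B_PARAMS = 6.2e9  # parameter count as stated in the README

# Assumption: roughly 16 bytes per trainable parameter with Adam
# (weights + gradients + optimizer states), before any activations.
BYTES_PER_TRAINED_PARAM = 16


def training_state_gib(trainable_params, bytes_per_param=BYTES_PER_TRAINED_PARAM):
    """Approximate memory for weights, gradients, and optimizer states."""
    return trainable_params * bytes_per_param / GIB


print(f"about {training_state_gib(CHATGLM_6B_PARAMS):.0f} GiB of training state")
```

Under these assumptions the state comes to roughly 92 GiB, far beyond a single consumer GPU. Methods that train only a small set of parameters avoid most of that cost.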

+1, I ran into the same problem. For p-tuning I solved it with: export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512. But for full-parameter fine-tuning, changing this parameter many times had no effect.

Jul 01 '23 03:07 johnnywuj81
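For anyone who launches from Python rather than a shell, the same setting can be applied through `os.environ`. This is a minimal sketch: the helper name `configure_allocator` is hypothetical, and the safe ordering is to set the variable before importing torch so the allocator sees it when it initializes. The setting only changes how cached blocks are split. It does not reduce the total memory a job needs, which may be why it made no difference for full-parameter fine-tuning.

```python
import os


def configure_allocator(max_split_size_mb=512):
    """Set PYTORCH_CUDA_ALLOC_CONF unless one was already exported."""
    os.environ.setdefault(
        "PYTORCH_CUDA_ALLOC_CONF", f"max_split_size_mb:{max_split_size_mb}"
    )
    return os.environ["PYTORCH_CUDA_ALLOC_CONF"]


configure_allocator()
# Import torch only after this point so the setting is in place
# before any CUDA memory is allocated.
```

Using `setdefault` means a value already exported in the shell takes precedence over the default in the script.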
