
[BUG] Strange GPU memory issue

Open vcvcvnvcvcvn opened this issue 1 year ago • 3 comments

Is there an existing issue / discussion for this?

  • [X] I have searched the existing issues / discussions

Is there an existing answer for this in the FAQ?

  • [X] I have searched the FAQ

Current Behavior

Whether I'm LoRA fine-tuning the 1.8B or the 7B model, GPU memory always fills up. My card is a V100 with 32GB of memory.

1.8B: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.00 MiB (GPU 7; 31.74 GiB total capacity; 3.22 GiB already allocated; 15.38 MiB free; 3.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

7B: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 3; 31.74 GiB total capacity; 9.44 GiB already allocated; 11.38 MiB free; 9.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
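The allocator hint in those tracebacks only applies when reserved memory far exceeds allocated memory, i.e. fragmentation; it cannot recover memory held by other processes. A minimal sketch of trying it, assuming the variable is set before CUDA initializes, with 128 MiB as a purely illustrative value:

```python
# Apply the max_split_size_mb hint from the OOM message.
# The variable is read when PyTorch's caching allocator initializes,
# so set it at the very top of the training script, before importing torch.
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # illustrative value

import torch  # imported only after the env var is in place
```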

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

Anything else?

No response

vcvcvnvcvcvn avatar Dec 02 '23 13:12 vcvcvnvcvcvn

I ran into the same problem when fine-tuning the 7B model with QLoRA. Uploading image.png…

Miamas777 avatar Dec 04 '23 12:12 Miamas777

Could you both share your transformers and accelerate versions? And is the GPU memory usage already higher than expected while the model is loading, or does the problem only appear during fine-tuning?
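One way to answer that question is to report allocated memory right after the model is loaded, before any training step. A minimal sketch, assuming the standard Transformers loading path for the 7B checkpoint (adjust the model name, dtype, and device for your setup):

```python
import torch
from transformers import AutoModelForCausalLM

# Load the model exactly as the fine-tuning script would, then report how
# much GPU memory this process holds before any optimizer state exists.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B",            # assumption: the 7B checkpoint being fine-tuned
    torch_dtype=torch.float16, # fp16 rather than bf16, since V100 has no bf16 support
    device_map="cuda:0",
    trust_remote_code=True,
)

gib = 1024 ** 3
print(f"allocated after load: {torch.cuda.memory_allocated(0) / gib:.2f} GiB")
print(f"reserved after load:  {torch.cuda.memory_reserved(0) / gib:.2f} GiB")
```

If these numbers already approach the card's capacity, the problem is in loading; if they look normal (roughly 14-15 GiB of weights for a 7B model in fp16), the spike happens during fine-tuning.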

jklj077 avatar Dec 05 '23 13:12 jklj077

image

I'm running into the same problem.

wenHK avatar Dec 15 '23 03:12 wenHK

Due to the lack of information, the following are my best guesses:

  1. The original poster's issue appears to be unrelated to the fine-tuning process. The GPU memory seems to be occupied by other processes (a quick way to check this is sketched after this list).
  2. The second poster's issue could be different from the original one, since QLoRA is adopted. The screenshot is unavailable now, so we don't know what happened there.
  3. The last poster's issue is also different from the previous ones, since a V100 16GB is used: 16GB cannot run LoRA fine-tuning of Qwen-7B.
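
For the first guess, the driver-level view of the device can be compared against what this process's PyTorch allocator accounts for; a large unexplained gap points at other processes. A minimal sketch (the last figure is a rough estimate, since it also includes this process's CUDA context overhead):

```python
import torch

torch.cuda.init()                         # make sure a CUDA context exists on GPU 0
free, total = torch.cuda.mem_get_info(0)  # driver-level free/total bytes on GPU 0
reserved = torch.cuda.memory_reserved(0)  # bytes held by this process's caching allocator

gib = 1024 ** 3
print(f"total:                    {total / gib:.2f} GiB")
print(f"free:                     {free / gib:.2f} GiB")
print(f"reserved by this process: {reserved / gib:.2f} GiB")
# Whatever the driver says is in use but this process did not reserve is,
# approximately, other processes plus CUDA context overhead.
print(f"used elsewhere (approx.): {(total - free - reserved) / gib:.2f} GiB")
```

The original tracebacks already show this pattern: only about 3.3 GiB is reserved by PyTorch, yet just 15 MiB of a 32 GiB card is free.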

For GPU memory usage, please take a look at the test here: https://github.com/QwenLM/Qwen/tree/main/recipes/finetune/deepspeed

jklj077 avatar Apr 02 '24 05:04 jklj077