ChatGLM2-6B

[BUG/Help] Deploying the model after fine-tuning: multiple GPUs are configured, but only GPU 0 is used and it runs out of memory

Open SebastianHan opened this issue 2 years ago • 2 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

Contents of `web_demo.sh`:

```shell
PRE_SEQ_LEN=128

CUDA_VISIBLE_DEVICES=5,6,7 python3 web_demo.py \
    --model_name_or_path /root/ChatGLM2-6B/chatglm2-6b \
    --ptuning_checkpoint output/adgen-chatglm2-6b-pt-128-2e-2/checkpoint-3000 \
    --pre_seq_len $PRE_SEQ_LEN
```

Multi-GPU setup: [screenshot]

Expected Behavior

No response

Steps To Reproduce

Error:

```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 10.92 GiB total capacity; 10.44 GiB already allocated; 19.25 MiB free; 10.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
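Two notes on this error. First, because `CUDA_VISIBLE_DEVICES=5,6,7` renumbers the visible devices, "GPU 0" in the traceback is physical GPU 5 — PyTorch only ever sees three cards, indexed 0–2. Second, `web_demo.py` loads the whole model onto a single device by default, so setting `CUDA_VISIBLE_DEVICES` alone does not spread the weights; the model's transformer layers have to be explicitly partitioned across the cards (the repo ships a `utils.load_model_on_gpus` helper for this). As a sketch of the partitioning idea only — `make_device_map` is a hypothetical helper, not the repo's actual API:

```python
# Sketch: evenly assign transformer layers to the visible GPUs so no single
# card has to hold the full model. ChatGLM2-6B has 28 transformer blocks.

def make_device_map(num_layers: int, num_gpus: int) -> dict:
    """Map each layer index to a GPU id, in contiguous chunks of
    ceil(num_layers / num_gpus) layers per card."""
    per_gpu = (num_layers + num_gpus - 1) // num_gpus  # ceiling division
    return {layer: min(layer // per_gpu, num_gpus - 1)
            for layer in range(num_layers)}

# Spread 28 blocks over the 3 cards visible under CUDA_VISIBLE_DEVICES=5,6,7:
device_map = make_device_map(28, 3)
```

With `per_gpu = 10`, layers 0–9 land on device 0, 10–19 on device 1, and 20–27 on device 2, so each card holds roughly a third of the weights instead of GPU 0 holding everything.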

Environment

- OS:
- Python:3.8
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

SebastianHan avatar Aug 24 '23 07:08 SebastianHan

After fine-tuning, how can I get `web_demo.sh` to successfully use multiple GPUs when deploying the model?

SebastianHan avatar Aug 24 '23 07:08 SebastianHan

Have you solved this problem?

2811668688 avatar Mar 14 '24 14:03 2811668688