ChatGLM2-6B

Full-parameter fine-tuning: [launch.py:315:sigkill_handler] Killing subprocess

Open jakeywu opened this issue 2 years ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

image

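7 GPUs with 24 GB of memory each: ![image](https://user-images.githubusercontent.com/11456239/259295223-40f1a478-f807-42ce-8b42-f5ac25632223.png)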

Expected Behavior

No response

Steps To Reproduce

image

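Launch script: ![image](https://user-images.githubusercontent.com/11456239/259295549-087de0c7-7336-4500-b219-0320cc467f0e.png)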

Environment

- OS:Ubuntu 18.04.6 LTS
- Python:3.10.9
- Transformers:4.30.2
- PyTorch:2.0.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True

Anything else?

No response

jakeywu avatar Aug 09 '23 04:08 jakeywu

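You are hitting CUDA out of memory. I replaced `--fp16` with `--pre_seq_len 128 --quantization_bit 4` and it ran. But it does not feel like there is any distributed speedup: I specified 4 GPUs, and the memory usage on every card is the same, so it looks like each GPU is repeating the same computation and nothing is being accelerated.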

feipengheart avatar Aug 09 '23 06:08 feipengheart
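
For context on the out-of-memory diagnosis above, here is a rough back-of-the-envelope sketch of per-GPU memory for mixed-precision Adam training. It is an estimate only, not the project's own calculation. It assumes about 6.2e9 parameters for the model and the commonly cited 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 master weights plus Adam momentum and variance), and it ignores activations and buffers. The launch script in this issue is only visible as a screenshot, so the sharding stage used here is unknown; the `zero_stage` argument and the function name `per_gpu_gib` are illustrative, not taken from the thread.

```python
def per_gpu_gib(num_params: float, num_gpus: int, zero_stage: int = 0) -> float:
    """Estimate per-GPU memory (GiB) for model states under mixed-precision Adam.

    Assumed byte counts per parameter (activations are NOT included):
      2  bytes fp16 weights
      2  bytes fp16 gradients
      12 bytes fp32 master weights + Adam momentum + Adam variance
    zero_stage 0 means plain data parallelism: every GPU holds a full replica.
    Stages 1, 2, 3 additionally shard optimizer states, gradients, and weights.
    """
    weights = 2.0 * num_params
    grads = 2.0 * num_params
    optim = 12.0 * num_params
    if zero_stage >= 1:
        optim /= num_gpus      # optimizer states split across GPUs
    if zero_stage >= 2:
        grads /= num_gpus      # gradients split as well
    if zero_stage >= 3:
        weights /= num_gpus    # weights split as well
    return (weights + grads + optim) / 2**30


if __name__ == "__main__":
    ASSUMED_PARAMS = 6.2e9   # assumption: approximate size of a "6B" model
    NUM_GPUS = 7             # from the issue: 7 cards, 24 GB each
    for stage in range(4):
        gib = per_gpu_gib(ASSUMED_PARAMS, NUM_GPUS, stage)
        print(f"stage {stage}: ~{gib:.1f} GiB per GPU before activations")
```

Under these assumptions, a full replica per GPU is far above 24 GB, which is consistent with the out-of-memory diagnosis. It also suggests why identical memory usage on every card is expected with plain data parallelism: each GPU keeps the same model states and works on a different slice of the batch, so equal usage does not by itself mean the work is duplicated.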

Hello, has this problem been solved?

xlhuang132 avatar Aug 09 '23 09:08 xlhuang132

No, it is still not solved. @xlhuang132

jakeywu avatar Oct 10 '23 10:10 jakeywu

How do you actually do full-parameter fine-tuning? Any help would be appreciated.

guanslai avatar Jan 29 '24 09:01 guanslai