Reproduced the issue. In OpenRLHF 0.5.7, after upgrading DeepSpeed from 0.15.0 to 0.16.0, my PPO demo runs out of memory (OOM):

```
File "/root/miniconda3/lib/python3.10/site-packages/openrlhf/utils/deepspeed/deepspeed.py", line 129, in backward
    model.backward(loss)
File "/root/miniconda3/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 18,...
```
This commit introduced the issue: https://github.com/deepspeedai/DeepSpeed/commit/cd20a3bbc7713908d7fb5fd7af4a91d52f126370. Open-Reasoner-Zero ran into the same problem: https://github.com/Open-Reasoner-Zero/Open-Reasoner-Zero/issues/13
I found that this issue occurs because `ds_grads_remaining` is not counted correctly for the forward pass when gradient checkpointing is used. Below is the log with gradient checkpointing enabled:
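To illustrate the suspected mechanism outside DeepSpeed: with gradient checkpointing, a checkpointed module's forward runs twice (once in the normal forward pass and again during recomputation in backward), so a counter that is incremented by a forward pre-hook and decremented by a backward hook never returns to zero. The sketch below uses plain PyTorch hooks and an illustrative `grads_remaining` attribute; it is not DeepSpeed's actual implementation, just a minimal reproduction of the counting pattern.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 8)
        # Illustrative stand-in for DeepSpeed's ds_grads_remaining bookkeeping.
        self.grads_remaining = 0

    def forward(self, x):
        return self.linear(x)


def on_forward(module, inputs):
    # Fires on every forward call, including the recomputation that
    # gradient checkpointing performs during backward.
    module.grads_remaining += 1


def on_backward(module, grad_input, grad_output):
    # Fires once per backward pass through the module.
    module.grads_remaining -= 1


block = Block()
block.register_forward_pre_hook(on_forward)
block.register_full_backward_hook(on_backward)

x = torch.randn(4, 8, requires_grad=True)

# Plain forward/backward: the counter goes +1 then -1 and ends at 0.
block(x).sum().backward()
print("without checkpointing:", block.grads_remaining)  # expected: 0

block.grads_remaining = 0

# With checkpointing: the forward pre-hook fires twice (original forward plus
# recomputation) but the backward hook fires only once, so the counter ends
# at 1 and anything whose release is gated on it reaching 0 is never freed.
checkpoint(block, x, use_reentrant=False).sum().backward()
print("with checkpointing:", block.grads_remaining)  # expected: 1
```

If the `ds_grads_remaining` bookkeeping miscounts in the same way, the partitioned parameters and gradient buffers it guards would never be released, which would match the OOM observed above. This is a hypothesis based on the counting pattern, not a reading of the DeepSpeed code path touched by the linked commit.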
Same problem. I guess you could add `model_dtype` in the config to solve it.