RagingPandas comments


Results: 2 comments by RagingPandas

Help understanding GPU memory spikes during GRPO training.

> The spike is caused by cross entropy and entropy computation in backward.

Thank you for the reply. Do you have any suggestions on what parameters I have set that...

Help understanding GPU memory spikes during GRPO training.

I see, thanks for your reply. I halved `ppo_max_token_len_per_gpu` (to 17k) and the spikes dropped to 91% of GPU memory. Given that vLLM rollout memory isn't a constraint, I'm...
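
As a rough sanity check on why halving the token budget helps, here is a back-of-the-envelope sketch. It assumes the spike is dominated by tensors shaped `[tokens, vocab]` (logits and their gradients in the cross entropy and entropy backward pass), held in fp32. The vocabulary size and the pre-halving budget of roughly 34k tokens are illustrative assumptions, not values taken from the thread, and `logits_bytes` is a hypothetical helper, not part of any library.

```python
def logits_bytes(num_tokens: int, vocab_size: int, bytes_per_element: int = 4) -> int:
    """Size in bytes of one dense [num_tokens, vocab_size] tensor."""
    return num_tokens * vocab_size * bytes_per_element


# Assumption: a vocabulary of about 152k entries; substitute your model's value.
VOCAB_SIZE = 152_064

# Assumption: the budget was about 34k tokens before being halved to 17k.
for tokens in (34_000, 17_000):
    gib = logits_bytes(tokens, VOCAB_SIZE) / 2**30
    print(f"{tokens} tokens: {gib:.1f} GiB per fp32 logits-sized tensor")
```

The size scales linearly with the token budget, so halving `ppo_max_token_len_per_gpu` should roughly halve each logits-sized temporary. The backward pass may hold several such tensors at once, so the actual peak depends on the implementation.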