Liger-Kernel Support DAPO Chunked loss

Open qingquansong opened this issue 8 months ago • 2 comments

🚀 The feature, motivation and pitch

ByteDance's DAPO is an open-source, state-of-the-art (SOTA) RL algorithm that scores 50 points on AIME 2024 using the Qwen2.5-32B pre-trained model, surpassing the previous SOTA set by DeepSeek's GRPO. The original DAPO code is now publicly available.

Alternatives

No response

Additional context

No response

Mar 21 '25 08:03 qingquansong

Hi! I'd like to take on this issue.

Mar 21 '25 08:03 srzhu97

Hi! I'd like to take on this issue.

🚀 Thanks! Assigned.

Mar 21 '25 08:03 qingquansong
