nazarenodefrancesc

Results: 1 issue by nazarenodefrancesc

DAPO looks like a promising RL algorithm that improves on GRPO. Are you planning to implement it? Thank you. https://github.com/BytedTsinghua-SIA/DAPO

feature request