
Make PPO loss hyper-parameters available to users

Open BaohaoLiao opened this issue 2 months ago • 3 comments

Thank you very much for this excellent work.

I would like to ask whether it is possible to let users control the hyper-parameters of the PPO loss, such as clip_ratio_high and clip_ratio_low, which are very important for controlling the exploration-exploitation trade-off in RL.

The Tinker documentation shows that we can implement a custom loss ourselves. However, the training time then becomes 3x longer, which is too much.
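
To make the request concrete, here is a minimal sketch of a clipped PPO surrogate loss with separate lower and upper clip ratios. It is plain Python and is not Tinker's API or its built-in loss. The function name `ppo_clip_loss` and the default values are assumptions for illustration; only the names clip_ratio_low and clip_ratio_high come from the request above.

```python
import math


def ppo_clip_loss(new_logprobs, old_logprobs, advantages,
                  clip_ratio_low=0.2, clip_ratio_high=0.2):
    """Mean clipped PPO surrogate loss over tokens (a value to minimize).

    Hypothetical helper for illustration only; not part of any library.
    """
    losses = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Importance ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Asymmetric clipping range: [1 - clip_ratio_low, 1 + clip_ratio_high].
        clipped = min(max(ratio, 1.0 - clip_ratio_low), 1.0 + clip_ratio_high)
        # Pessimistic (minimum) surrogate objective, negated to form a loss.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)


# One token whose probability grew by 1.5x, with a positive advantage.
tight = ppo_clip_loss([math.log(1.5)], [0.0], [1.0], clip_ratio_high=0.2)
loose = ppo_clip_loss([math.log(1.5)], [0.0], [1.0], clip_ratio_high=0.28)
```

With a larger clip_ratio_high, tokens with positive advantage can raise their probability further before the objective is clipped, which is why exposing the two bounds separately matters for exploration.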

BaohaoLiao avatar Oct 11 '25 17:10 BaohaoLiao

Looking into it ...

Tiiiger avatar Oct 12 '25 06:10 Tiiiger

When might this be added?

JasonWei05 avatar Nov 11 '25 00:11 JasonWei05

Looking forward to this

QingyuanWuNothing avatar Nov 21 '25 17:11 QingyuanWuNothing