tinker-cookbook
Make PPO loss hyper-parameters available to users
Thank you very much for this excellent work.
Would it be possible to let users control the hyper-parameters of the PPO loss, such as clip_ratio_high and clip_ratio_low? These are very important for controlling the exploration-exploitation trade-off in RL.
The Tinker documentation shows that we can implement a custom loss ourselves. However, training time then becomes 3x longer, which is too much.
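To make the request concrete, here is a minimal sketch of how the two clip parameters would enter a clipped PPO surrogate loss. It is plain Python for illustration only, not Tinker's implementation; the function name `ppo_clip_loss` and its signature are hypothetical, and only the parameter names clip_ratio_low and clip_ratio_high come from this issue.

```python
import math


def ppo_clip_loss(logp_new, logp_old, advantages,
                  clip_ratio_low=0.2, clip_ratio_high=0.2):
    """Mean clipped PPO surrogate loss over tokens (a value to minimize).

    Hypothetical helper for illustration; not part of Tinker's API.
    logp_new / logp_old: per-token log-probs under the current and sampling policy.
    advantages: per-token advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Importance ratio between the current policy and the sampling policy.
        ratio = math.exp(lp_new - lp_old)
        # Asymmetric clipping: the lower and upper bounds are set independently.
        clipped = min(max(ratio, 1.0 - clip_ratio_low), 1.0 + clip_ratio_high)
        # Pessimistic (min) surrogate, negated so that lower is better.
        total += -min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Raising clip_ratio_high lets positive-advantage tokens move further per update.
tight = ppo_clip_loss([math.log(2.0)], [0.0], [1.0], clip_ratio_high=0.2)
loose = ppo_clip_loss([math.log(2.0)], [0.0], [1.0], clip_ratio_high=0.5)
```

With a larger clip_ratio_high, tokens with positive advantage can have their probability increased further before the objective stops rewarding the change, which is why this knob affects exploration.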
Looking into it ...
When might this be added?
Looking forward to this