philharmonikerzzy

2 issues by philharmonikerzzy

Hi, with the current PPO implementation in `PPOTrainer`, I'm seeing that the entropy of the actively updated model keeps increasing as training proceeds. It seems...

Do we have to push the model to Hugging Face to be able to load a model trained locally?