
PolicyOptimization Agents do not log signals to csv

Open crzdg opened this issue 4 years ago • 1 comment

I encountered a strange behavior.

For ClippedPPO, PPO and ActorCritic I was not able to get the signals defined in their init methods: Loss, Gradients, Likelihood, KL Divergence, etc.

I'm not sure if it is an issue in my environment implementation, but DQN does log its signals. I also checked the signals dumped by update_log. For the mentioned agents, self.episode_signals includes duplicate entries for exactly the signals that are not logged: the signals are defined at several levels of the agent class hierarchy, and each level saves them to self.episode_signals again. As a result, only the most recently created signal object is updated with values in the train method.
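To illustrate what I mean, here is a minimal, self-contained sketch (hypothetical class and method names, not the actual Coach API) of how re-registering the same signal in a subclass `__init__` leaves a stale duplicate in `episode_signals`:

```python
# Hypothetical sketch of duplicate signal registration across a class
# hierarchy. Only the last-created Signal object receives samples during
# training; a logger that reads the first matching entry sees nothing.

class Signal:
    def __init__(self, name):
        self.name = name
        self.values = []

    def add_sample(self, value):
        self.values.append(value)


class Agent:
    def __init__(self):
        self.episode_signals = []
        # base class registers its own "Loss" signal
        self.loss = self.register_signal("Loss")

    def register_signal(self, name):
        signal = Signal(name)
        self.episode_signals.append(signal)  # no duplicate check
        return signal


class PolicyOptimizationAgent(Agent):
    def __init__(self):
        super().__init__()
        # subclass registers "Loss" again, shadowing the base one
        self.loss = self.register_signal("Loss")

    def train(self):
        # only the newest Signal object gets values
        self.loss.add_sample(0.5)


agent = PolicyOptimizationAgent()
agent.train()

names = [s.name for s in agent.episode_signals]
print(names)  # two "Loss" entries

# a logger that takes the first "Loss" entry sees no samples
first_loss = next(s for s in agent.episode_signals if s.name == "Loss")
print(first_loss.values)  # empty: the stale duplicate was never updated
print(agent.episode_signals[-1].values)  # the shadowing copy holds the data
```

If Coach's real registration behaves like this, the CSV dump could end up reading the stale duplicate instead of the one updated in train.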

It could also be related to signals being updated before every episode: since gradients are only available after training, they might get reset after the last training iteration once a new episode starts.

However, I do have experiments with ClippedPPO where those signals were logged, but I can't reproduce this.

Any suggestions?

crzdg · May 25 '20 23:05