stable-baselines
PPO2: Call total_episode_reward_logger before incrementing num_timesteps
Description
In ppo2.py, total_episode_reward_logger should be called before num_timesteps is incremented. See the explanation in #143.
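To illustrate why the call order matters, here is a minimal sketch. It does not use the real stable-baselines code: `log_episode_rewards` is a hypothetical stand-in, and it assumes the logger stamps each finished episode with the `steps` value it receives plus the episode's offset within the batch. Under that assumption, incrementing the counter first shifts every logged timestep forward by one batch.

```python
def log_episode_rewards(rewards, dones, steps):
    """Hypothetical stand-in for an episode reward logger (assumption,
    not the library function). Returns (timestep, episode_reward) pairs,
    where timestep = steps + offset of the episode end within the batch."""
    logged = []
    acc = 0.0
    for offset, (reward, done) in enumerate(zip(rewards, dones)):
        acc += reward
        if done:
            logged.append((steps + offset, acc))
            acc = 0.0
    return logged


n_batch = 4
rewards = [1.0, 1.0, 1.0, 1.0]
dones = [False, True, False, True]

# Order proposed in this PR: log first, then advance the counter.
num_timesteps = 0
logged_before = log_episode_rewards(rewards, dones, num_timesteps)
num_timesteps += n_batch

# Previous order: advance the counter first, then log.
num_timesteps = 0
num_timesteps += n_batch
logged_after = log_episode_rewards(rewards, dones, num_timesteps)

print(logged_before)  # timesteps relative to the start of the batch
print(logged_after)   # same rewards, timesteps shifted by n_batch
```

In this sketch the episode rewards are identical in both orders; only the timestep attached to each one differs, by exactly `n_batch`.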
Motivation and Context
It addresses the reported episode reward logging issue, and a PR was requested.
- [x] I have raised an issue to propose this change (required for new features and bug fixes)
Types of changes
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation (update in the documentation)
Checklist:
- [x] I've read the CONTRIBUTION guide (required)
- [x] I have updated the changelog accordingly (required).
- [ ] My change requires a change to the documentation.
- [ ] I have updated the tests accordingly (required for a bug fix or a new feature).
- [ ] I have updated the documentation accordingly.
Hello,
Please fill in the PR template completely.