stable-baselines3
[Bug]: Iteration not updated in locals while learning
🐛 Bug
In the method stable_baselines3.common.on_policy_algorithm.OnPolicyAlgorithm.learn, the iteration value is not updated in the locals dictionary passed to callbacks.
To Reproduce
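import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import ConvertCallback

def callback_function(v_locals, v_globals):
    # Read the iteration counter from the locals passed to the callback
    iteration_index = v_locals['iteration']
    print(f'iteration_index={iteration_index}')
    return True  # returning False would stop training

checkpoint_callback = ConvertCallback(callback_function)
# Assumption: the report does not define env; CartPole-v1 is a stand-in
env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000, callback=[checkpoint_callback])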
Relevant log output / Error message
iteration_index=0
iteration_index=0
iteration_index=0
----------------------------------------
| time/ | |
| fps | 69 |
| iterations | 3 |
| time_elapsed | 87 |
| total_timesteps | 6144 |
| train/ | |
| approx_kl | 0.00776526 |
| clip_fraction | 0.0327 |
| clip_range | 0.2 |
| entropy_loss | -1.37 |
| explained_variance | -0.000442 |
| learning_rate | 0.0003 |
| loss | 3.45e+04 |
| n_updates | 20 |
| policy_gradient_loss | -0.0103 |
| value_loss | 6.32e+04 |
----------------------------------------
iteration_index=0
iteration_index=0
iteration_index=0
System Info
- OS: Windows-11-10.0.22631-SP0 10.0.22631
- Python: 3.12.0
- Stable-Baselines3: 2.2.1
- PyTorch: 2.2.0+cpu
- GPU Enabled: False
- Numpy: 1.26.3
- Cloudpickle: 3.0.0
- Gymnasium: 0.29.1
Checklist
- [X] My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
- [X] I have checked that there is no similar issue in the repo
- [X] I have read the documentation
- [X] I have provided a minimal and working example to reproduce the bug
- [X] I've used the markdown code blocks for both code and stack traces.
Hello,
if you want the number of iterations, the best approach is to keep a counter that you increment at every call of on_rollout_end().