Logging Error: 'rollout_buffer' from DDPG training?
Windows 10, Python 3.11
Repeats indefinitely during training.
I am in the same situation as you. Because DRLAgent.train_model() is an encapsulated wrapper in FinRL, it does not expose all of learn()'s parameters. You can see this in its source code:

def train_model(self, model, tb_log_name, total_timesteps=50000):
    ...
    model.learn(total_timesteps=total_timesteps, tb_log_name=tb_log_name)
    ...

You can call learn() directly instead:
model_ddpg = agent.get_model("ddpg", model_kwargs=DDPG_PARAMS)
trained_ddpg = model_ddpg.learn(
    total_timesteps=50000,
    log_interval=1,
    tb_log_name="ddpg",
)
Adding the log_interval parameter lets you control how frequently statistics are logged (for off-policy algorithms it is measured in episodes), so you can reproduce and troubleshoot the problem quickly.
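For completeness, here is a minimal sketch of the setup the snippet above assumes; env_train is a placeholder for the trading environment you built earlier, and the DDPG_PARAMS values are illustrative, not tuned:

from finrl.agents.stablebaselines3.models import DRLAgent

# Assumed: env_train is the FinRL trading environment created earlier
agent = DRLAgent(env=env_train)

# Illustrative hyperparameters; tune them for your own problem
DDPG_PARAMS = {
    "batch_size": 128,
    "buffer_size": 50_000,
    "learning_rate": 0.001,
}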
The problem seems to be that off-policy algorithms like DDPG, SAC, and TD3 use a replay_buffer, whereas only on-policy algorithms have the rollout_buffer that FinRL's Tensorboard callback tries to read. I modified finrl/agents/stablebaselines3/models.py as follows and the problem disappeared.
# Note: this uses the statistics module, so models.py needs
# "import statistics" at the top if it is not already there.
def _on_rollout_end(self) -> bool:
    # This method is only called for on-policy algorithms (A2C, PPO).
    # Off-policy algorithms (DDPG, SAC, TD3) use a replay_buffer instead.
    try:
        # Only on-policy algorithms expose rollout_buffer in self.locals
        if "rollout_buffer" in self.locals:
            rollout_buffer_rewards = self.locals["rollout_buffer"].rewards.flatten()
            self.logger.record(
                key="train/reward_min", value=min(rollout_buffer_rewards)
            )
            self.logger.record(
                key="train/reward_mean", value=statistics.mean(rollout_buffer_rewards)
            )
            self.logger.record(
                key="train/reward_max", value=max(rollout_buffer_rewards)
            )
    except Exception as error:
        # Handle the case where "rewards" is not found or other errors
        self.logger.record(key="train/reward_min", value=None)
        self.logger.record(key="train/reward_mean", value=None)
        self.logger.record(key="train/reward_max", value=None)
        print("Logging Error:", error)
    return True
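If you want reward statistics for the off-policy algorithms instead of just skipping them, one option is to read the rewards back out of the replay buffer. This is only a sketch, not FinRL's code: it assumes Stable-Baselines3's ReplayBuffer layout (a rewards array plus the pos/full markers of the circular buffer), and the class name OffPolicyRewardCallback is made up for illustration.

import numpy as np
from stable_baselines3.common.callbacks import BaseCallback


class OffPolicyRewardCallback(BaseCallback):
    """Log reward statistics for DDPG/SAC/TD3 from the replay buffer."""

    def _on_rollout_end(self) -> None:
        buffer = getattr(self.model, "replay_buffer", None)
        if buffer is None:
            return
        # Only the filled portion of the circular buffer holds valid data
        end = buffer.buffer_size if buffer.full else buffer.pos
        if end == 0:
            return
        rewards = np.asarray(buffer.rewards[:end]).flatten()
        self.logger.record("train/reward_min", float(rewards.min()))
        self.logger.record("train/reward_mean", float(rewards.mean()))
        self.logger.record("train/reward_max", float(rewards.max()))

You would pass it to training via model_ddpg.learn(..., callback=OffPolicyRewardCallback()). Note that these statistics cover everything stored in the buffer so far, not just the latest rollout.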