
Logging Error: 'rollout_buffer' from DDPG training?

Open windowshopr opened this issue 5 months ago • 2 comments

Windows 10, Python 3.11

[screenshots of the repeated logging error]

windowshopr avatar Jul 24 '25 01:07 windowshopr

Repeats indefinitely during training.

windowshopr avatar Jul 24 '25 01:07 windowshopr

> Repeats indefinitely during training.

I am in the same situation as you. Because DRLAgent.train_model() is a wrapper function in FinRL, it does not expose all of learn()'s parameters. You can see this in its source code:

    def train_model(self, model, tb_log_name, total_timesteps=50000):
        ...
        model.learn(total_timesteps=total_timesteps, tb_log_name=tb_log_name)
        ...

You can call learn() directly instead:

    model_ddpg = agent.get_model("ddpg", model_kwargs=DDPG_PARAMS)
    trained_ddpg = model_ddpg.learn(
        total_timesteps=50000,
        log_interval=1,
        tb_log_name="ddpg",
    )

Adding the log_interval parameter lets you control how often logs are written, so you can troubleshoot the problem quickly.

pawnzhang avatar Aug 13 '25 07:08 pawnzhang

The problem seems to be that off-policy algorithms like DDPG, SAC, and TD3 use a replay_buffer, while only on-policy algorithms have a rollout_buffer. I modified finrl/agents/stablebaselines3/models.py and the problem disappeared.

    def _on_rollout_end(self) -> bool:
        # This method is only useful for on-policy algorithms (A2C, PPO);
        # off-policy algorithms (DDPG, SAC, TD3) use a replay_buffer instead,
        # so the rollout_buffer lookup is guarded.
        # Note: requires "import statistics" at the top of models.py.
        try:
            # Only on-policy algorithms have rollout_buffer
            if "rollout_buffer" in self.locals:
                rollout_buffer_rewards = self.locals["rollout_buffer"].rewards.flatten()
                self.logger.record(
                    key="train/reward_min", value=min(rollout_buffer_rewards)
                )
                self.logger.record(
                    key="train/reward_mean",
                    value=statistics.mean(rollout_buffer_rewards),
                )
                self.logger.record(
                    key="train/reward_max", value=max(rollout_buffer_rewards)
                )
        except Exception as error:  # Exception, not BaseException, so Ctrl-C still works
            # Handle the case where "rewards" is not found or other errors
            self.logger.record(key="train/reward_min", value=None)
            self.logger.record(key="train/reward_mean", value=None)
            self.logger.record(key="train/reward_max", value=None)
            print("Logging Error:", error)
        return True
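
If you want reward logging for the off-policy algorithms as well, rather than just skipping them, Stable-Baselines3's off-policy collect_rollouts() also exposes its replay_buffer through self.locals, so the callback can log from that instead. A minimal sketch, assuming a recent SB3 ReplayBuffer with rewards, pos, full, and buffer_size attributes (the slicing accounts for its circular storage):

    def _on_rollout_end(self) -> None:
        if "rollout_buffer" in self.locals:  # on-policy: A2C, PPO
            rewards = self.locals["rollout_buffer"].rewards.flatten()
        elif "replay_buffer" in self.locals:  # off-policy: DDPG, SAC, TD3
            buffer = self.locals["replay_buffer"]
            # The buffer is circular: only the filled portion holds valid data.
            end = buffer.buffer_size if buffer.full else buffer.pos
            rewards = buffer.rewards[:end].flatten()
        else:
            return
        if len(rewards) > 0:
            self.logger.record(key="train/reward_min", value=float(rewards.min()))
            self.logger.record(key="train/reward_mean", value=float(rewards.mean()))
            self.logger.record(key="train/reward_max", value=float(rewards.max()))

Note that this logs statistics over everything currently stored in the replay buffer, not just the latest rollout, so the numbers will be smoother and lag the on-policy equivalents.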

JuanFuriaz avatar Dec 16 '25 09:12 JuanFuriaz