[Bug] (and solution?) PPO policy training and saving fails in latest Brax version

Open ccdonosoo opened this issue 7 months ago • 0 comments

Hi,

There appears to be a bug in the latest version of Brax when training and saving a policy using PPO (commit af646c6). The error occurs during the save step of training; loading an existing policy works fine.

Steps to Reproduce

  • Train a policy using the PPO agent.
  • Attempt to save the policy during training, using ppo.
  • An error is raised related to observation_size, because it is not serializable to JSON (the obs_shape inside ppo.train() seems to be an Array).

Cause

In brax/training/agents/ppo/checkpoint.py, lines 53-55 are currently:

    return checkpoint.network_config(
        observation_size, action_size, normalize_observations, network_factory
    )

The preliminary solution that I found is to change that to:

    return checkpoint.network_config(
        observation_size.shape, action_size, normalize_observations, network_factory
    )
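For anyone who wants to see the failure mode in isolation, here is a minimal sketch using only the standard library. It is not Brax code: FakeArray is a hypothetical stand-in for an array object that exposes .shape, json_safe_size is a helper name I made up, and the sizes 27 and 8 are illustrative values. It only shows why json.dumps rejects an array-like value and how reducing it to its shape avoids that.

```python
import json
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class FakeArray:
    """Hypothetical stand-in for an array type; it only exposes `.shape`."""
    shape: Tuple[int, ...]


def json_safe_size(size: Any) -> Any:
    """Convert an observation-size value into something json.dumps accepts.

    Hypothetical helper, not part of Brax.
    """
    if hasattr(size, "shape"):
        # Array-like object: keep only its shape, as plain Python ints.
        return [int(d) for d in size.shape]
    if isinstance(size, dict):
        # Dict of per-key sizes: convert each entry recursively.
        return {k: json_safe_size(v) for k, v in size.items()}
    if isinstance(size, (tuple, list)):
        return [int(d) for d in size]
    return int(size)


obs = FakeArray(shape=(27,))  # illustrative observation shape

# Passing the array-like object straight to json.dumps raises TypeError.
try:
    json.dumps({"observation_size": obs})
except TypeError as err:
    print("raw value fails:", err)

# Reducing it to its shape first gives a serializable config.
config_json = json.dumps(
    {"observation_size": json_safe_size(obs), "action_size": 8}
)
print(config_json)
```

Unlike the one-line change above, this helper also leaves a plain int untouched, which may matter if observation_size is sometimes an int and sometimes an Array; I have not checked which cases actually occur inside ppo.train().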

I tested SAC with the "ant" environment and it does not have this issue.

Thank you in advance, regards!

ccdonosoo, Apr 30 '25 18:04