brax
Massively parallel rigidbody physics simulation on accelerator hardware.
Hi team, I'd like to propose a small but helpful modification to the PPO training setup in Brax. Currently, in `brax/training/agents/ppo/networks.py`, the action distribution used by default is hardcoded to...
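The kind of change being proposed can be sketched in plain Python. Note that `make_ppo_networks`, `NormalTanhDistribution`, and the `distribution_factory` parameter below are illustrative stand-ins, not Brax's actual API:

```python
from typing import Callable

# Toy distribution classes standing in for a parametric action
# distribution; names are illustrative only.
class NormalTanhDistribution:
    def __init__(self, event_size):
        self.event_size = event_size

class NormalDistribution:
    def __init__(self, event_size):
        self.event_size = event_size

def make_ppo_networks(action_size,
                      distribution_factory: Callable = NormalTanhDistribution):
    # Instead of hardcoding one distribution, accept a factory so
    # callers can swap in a different parametric distribution while
    # keeping the old behavior as the default.
    return distribution_factory(event_size=action_size)

# Default keeps the original behavior; callers may override it.
default_dist = make_ppo_networks(8)
custom_dist = make_ppo_networks(4, distribution_factory=NormalDistribution)
```

Keeping the old class as the default argument means existing call sites are unaffected by such a change.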
I encountered this bug when running the MuJoCo Playground tutorial with the following command: `python learning/train_jax_ppo.py --env_name CartpoleBalance`. This command effectively runs `brax/training/agents/ppo/train.py`. I resolved the bug by referring...
This change ensures that the original system (env.sys) remains unchanged during JAX tracing. This prevents the UnexpectedTracerError that would occur when wrapping the same environment twice. Tests: - Training on...
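The idea behind the fix can be sketched in plain Python; the `System`, `Env`, and `wrap_for_training` names below are toy stand-ins for Brax's environment and physics system, not its real classes:

```python
import copy

# Toy stand-ins for a Brax environment and its physics System; the
# actual fix described above copies env.sys before wrapping so JAX
# tracing never mutates the original. Names are illustrative only.
class System:
    def __init__(self, gravity):
        self.gravity = gravity

class Env:
    def __init__(self, sys):
        self.sys = sys

def wrap_for_training(env):
    # Wrap a deep copy of the system: changes made during tracing
    # (e.g. by a randomization_fn) cannot leak back into env.sys,
    # so wrapping the same environment twice stays safe.
    return Env(copy.deepcopy(env.sys))

base = Env(System(gravity=-9.81))
train_env = wrap_for_training(base)
eval_env = wrap_for_training(base)  # second wrap of the same env
train_env.sys.gravity = 0.0         # simulate an in-trace mutation
```

Because each wrapper owns its own copy, mutating `train_env.sys` leaves both `base.sys` and `eval_env.sys` untouched.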
Hi, I've encountered an `UnexpectedTracerError` when using Brax's PPO implementation with the mjx backend under specific conditions: **Conditions:** - No additional `eval_env` provided. - Using a `randomization_fn` that depends on...
Hi Brax Team, I am unsure whether to ask this in the MuJoCo repository or here, so please excuse me if this question is misplaced :) I am looking to...
I am trying to run this simple code:
```python
import jax
from brax import envs

env_name = 'ant'
backend = 'mjx'
env = envs.get_environment(env_name=env_name, backend=backend)
print(env.observation_size)
print(env.action_size)
state = jax.jit(env.reset)(rng=jax.random.PRNGKey(seed=0))
```
which...
Hello brax team, thanks for your implementation of BC. I found some code problems when using it. I tried to fix them with AI, but I think...
Hello brax team, recently I've been trying to train a humanoid robot to squat. To be honest, it's hard to train with a pure PPO algorithm and self-defined reward functions. So, I'm...
In brax/envs/wrappers/training.py the EpisodeWrapper.step method correctly uses lax.scan to collect and sum rewards over action_repeat sub‑steps, but it never accumulates the corresponding per‑step state.metrics. After the scan it does: ```...
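The missing accumulation can be sketched in pure Python; the loop below stands in for `lax.scan`, and `episode_step`, `substep`, and the metric name are illustrative, not Brax's actual code:

```python
# Sketch of the fix the issue describes: like the reward, per-step
# state metrics should be summed across action_repeat sub-steps
# instead of keeping only the last sub-step's values.
def episode_step(state, action, substep, action_repeat):
    total_reward = 0.0
    totals = {k: 0.0 for k in state["metrics"]}
    for _ in range(action_repeat):  # stands in for lax.scan
        state = substep(state, action)
        total_reward += state["reward"]
        for k, v in state["metrics"].items():  # accumulate metrics too
            totals[k] += v
    return {**state, "reward": total_reward, "metrics": totals}

def substep(state, action):
    # Dummy physics step returning a fixed reward and metric.
    return {"reward": 1.0, "metrics": {"forward_velocity": 0.5}}

out = episode_step({"reward": 0.0, "metrics": {"forward_velocity": 0.0}},
                   None, substep, action_repeat=3)
```

With three sub-steps the sketch yields a summed reward of 3.0 and a summed metric of 1.5, whereas dropping the accumulation would report only the final sub-step's 0.5.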
Hi, there appears to be a bug in the latest version of Brax when training and saving a policy using PPO (commit `af646c6`). The error occurs during the save step of...