sheeprl
Question for training dreamer_v3 on InvertedPendulum-v4
I ran the command below to train on InvertedPendulum-v4:
MUJOCO_GL=egl python sheeprl.py exp=dreamer_v3 env=mujoco env.id=InvertedPendulum-v4 fabric.accelerator=cuda fabric.devices=1 fabric.strategy=ddp algo.mlp_keys.encoder=[state] algo.mlp_keys.decoder=[state] algo.cnn_keys.encoder=[] algo.cnn_keys.decoder=[]
but the agent is not learning (I expected it to reach a score of 1000):
Rank-0: policy_step=829396, reward_env_1=4.0
Rank-0: policy_step=829400, reward_env_2=4.0
Rank-0: policy_step=829404, reward_env_0=4.0
Rank-0: policy_step=829404, reward_env_3=4.0
Rank-0: policy_step=829412, reward_env_1=4.0
Rank-0: policy_step=829416, reward_env_2=4.0
Am I missing something?
Hi @anthony0727, have you tried training with different seeds?
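In case it helps, here is a rough sketch of how the same run could be repeated over a few seeds. It only builds and prints the command lines, reusing the overrides from the command in the question. Note that the `seed=<n>` override is an assumption about the config layout (I have not checked it against your version), and the helper name `build_command` and the seed values are just placeholders for illustration.

```python
import shlex

# Overrides copied verbatim from the command in the question.
BASE_OVERRIDES = [
    "exp=dreamer_v3",
    "env=mujoco",
    "env.id=InvertedPendulum-v4",
    "fabric.accelerator=cuda",
    "fabric.devices=1",
    "fabric.strategy=ddp",
    "algo.mlp_keys.encoder=[state]",
    "algo.mlp_keys.decoder=[state]",
    "algo.cnn_keys.encoder=[]",
    "algo.cnn_keys.decoder=[]",
]


def build_command(seed: int) -> str:
    """Return a shell command line for one training run with the given seed."""
    # ASSUMPTION: `seed` is a top-level config key that accepts an override
    # of the form `seed=<int>`; verify this against your installed version.
    args = ["python", "sheeprl.py", *BASE_OVERRIDES, f"seed={seed}"]
    # Quote each argument so the square brackets survive the shell unchanged.
    return "MUJOCO_GL=egl " + " ".join(shlex.quote(a) for a in args)


# Hypothetical seed values, chosen only for illustration.
for seed in (1, 2, 3):
    print(build_command(seed))
```

If the reward stays at 4.0 for every seed, the problem is probably in the configuration rather than an unlucky initialization.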