
Success rate computation for diffusion_policy PushT is broken since the v1.0.0 release of gymnasium

Open zimka opened this issue 4 months ago • 0 comments

System Info

- `lerobot` version: 0.1.0
- Platform: Linux-6.5.0-41-generic-x86_64-with-glibc2.35
- Python version: 3.10.15
- Huggingface_hub version: 0.25.2
- Dataset version: 3.0.1
- Numpy version: 1.26.4
- PyTorch version (GPU?): 2.4.1+cu121 (True)
- Cuda version: 12010
- Using GPU in script?: True

Information

  • [X] One of the scripts in the examples/ folder of LeRobot
  • [ ] My own task or dataset (give details below)

Reproduction

TLDR: I want to reproduce diffusion policy training for PushT with visual input. In the wandb logs, the loss, metrics, and logged videos all behave as expected, yet the success_rate during eval is strictly 0.

Reproduction steps (exactly as they are for pretrained model https://huggingface.co/lerobot/diffusion_pusht):

  1. python lerobot/scripts/train.py hydra.run.dir=outputs/train/diffusion_pusht hydra.job.name=diffusion_pusht policy=diffusion env=pusht env.task=PushT-v0 dataset_repo_id=lerobot/pusht training.offline_steps=200000 training.save_freq=20000 training.eval_freq=10000 eval.n_episodes=50 wandb.enable=true wandb.disable_artifact=true device=cuda
  2. python lerobot/scripts/eval.py -p outputs/train/diffusion_pusht/checkpoints/last/pretrained_model/

The output is: {'avg_sum_reward': 101.59729324556768, 'avg_max_reward': 0.9727046033094218, 'pc_success': 0.0, 'eval_s': 69.18956518173218, 'eval_ep_s': 1.3837913274765015}
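The combination of a near-perfect avg_max_reward (≈0.97) with pc_success pinned at exactly 0.0 is what suggests the success flag is never being read, rather than the policy genuinely failing. As a rough illustration (function and variable names here are hypothetical, not LeRobot's actual API), a percent-success metric like pc_success is just the fraction of episodes whose success flag is True, so it stays at 0.0 if the flag is never set:

```python
def percent_success(successes: list[bool]) -> float:
    """Percentage of episodes whose success flag is True."""
    if not successes:
        return 0.0
    return 100.0 * sum(successes) / len(successes)

# If the env never reports a success flag (the suspected bug), every
# entry is False and the metric reads 0.0 regardless of reward:
print(percent_success([False] * 50))  # -> 0.0
```

This is why high rewards and zero success can coexist: reward is read from the env's return value, while success is read from a separate info field that (per the hypothesis below) gymnasium v1.0.0 no longer populates.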

The problem is `'pc_success': 0.0`. The wandb logs are here: https://wandb.ai/zimka/lerobot/runs/r2qef1ii

Expected behavior

I expect 'pc_success' to be somewhere around 70, as it is for the pretrained model.

The issue is most likely caused by the recent v1.0.0 release of gymnasium: https://github.com/Farama-Foundation/Gymnasium/releases/tag/v1.0.0

It looks like the final_info key-value pair consumed by lerobot/scripts/eval.py is no longer returned from the env.step() method, which silently breaks the success_rate computation in lerobot/scripts/eval.py.
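One way to guard against this kind of API change is a small compatibility shim that reads success flags from either layout of the vectorized `info` dict. The sketch below is written under assumptions: the key names `is_success`, `_is_success`, and the pre-1.0 `final_info` list follow common gym vector-env conventions and have not been verified against eval.py:

```python
def extract_successes(info: dict, n_envs: int) -> list[bool]:
    """Best-effort read of per-env success flags from a vectorized
    env's `info` dict, covering both gymnasium layouts.

    gymnasium < 1.0:  terminal data arrives under info["final_info"],
                      a list of per-env dicts (None for envs that did
                      not finish on this step).
    gymnasium >= 1.0: "final_info" is gone; keys like "is_success"
                      appear directly, with a boolean mask
                      "_is_success" marking which envs reported them.

    Key names are assumptions based on the common convention, not
    LeRobot's verified API.
    """
    # Old layout (gymnasium < 1.0).
    if "final_info" in info:
        return [
            bool(env_info and env_info.get("is_success", False))
            for env_info in info["final_info"]
        ]
    # New layout (gymnasium >= 1.0): flat keys plus a "_"-prefixed mask.
    flags = info.get("is_success", [False] * n_envs)
    mask = info.get("_is_success", [True] * n_envs)
    return [bool(f and m) for f, m in zip(flags, mask)]
```

With a helper like this, code that previously did `info["final_info"][i]["is_success"]` would keep working under v1.0.0 instead of silently reporting zero successes. Alternatively, pinning `gymnasium < 1.0` in the project requirements would sidestep the change entirely.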

zimka avatar Oct 13 '24 16:10 zimka