
[Bug Report] Discarding the last timestep in each trajectory causes check_envs.py to fail

Open Seraphli opened this issue 1 year ago • 0 comments

If you are submitting a bug report, please fill in the following details and use the tag [bug].

Describe the bug: The check_envs.py script always fails on hammer-human-v0.

Checking hammer-human-v0
/home/seraphli/.pyenv/versions/3.8.5/envs/dien/lib/python3.8/site-packages/gym/envs/registration.py:505: UserWarning: WARN: The environment hammer-human-v0 is out of date. You should consider upgrading to version `v1` with the environment ID `hammer-human-v1`.
  logger.warn(
/home/seraphli/Workspace/Github/Others/d4rl/d4rl/hand_manipulation_suite/hammer_v0.py:14: UserWarning: This environment is deprecated. Please use the most recent version of this environment.
  offline_env.OfflineEnv.__init__(self, **kwargs)
load datafile: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 207.43it/s]
         Max episode steps: 200
         (11310, 46) (11310, 26)
         11310 samples
         num terminals: 25
         11264 samples
         num terminals: 24
Traceback (most recent call last):
  File "/home/seraphli/Workspace/Github/Others/d4rl/scripts/check_envs.py", line 95, in <module>
    assert orig_terminals == np.sum(dset['terminals']), 'Qlearining terminals doesnt match original terminals'
AssertionError: Qlearining terminals doesnt match original terminals

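Additional context: This is caused by the qlearning_dataset function. It discards the last timestep of each trajectory, so the converted dataset ends up with fewer terminals (24) than the original (25), and the assertion in check_envs.py fails.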
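To illustrate how a terminal can go missing, here is a minimal sketch. It is not the actual qlearning_dataset code: to_transitions is a hypothetical, simplified stand-in that assumes the conversion pairs each sample with its successor (so the final sample is never visited) and skips steps flagged as timeouts. Whether this is the exact path taken for hammer-human-v0 is an assumption.

```python
import numpy as np

def to_transitions(terminals, timeouts):
    """Hypothetical stand-in for a Q-learning style dataset conversion.

    Each sample i is paired with sample i + 1 as its next observation,
    so the loop stops at n - 1 and the very last sample is dropped.
    Steps flagged as timeouts are skipped as well.
    """
    n = len(terminals)
    kept = []
    for i in range(n - 1):  # the last sample has no successor
        if timeouts[i]:     # next observation would belong to a new episode
            continue
        kept.append(bool(terminals[i]))
    return np.array(kept, dtype=bool)

# Toy data: two 3-step trajectories, each ending with terminal=True.
terminals = np.array([0, 0, 1, 0, 0, 1], dtype=bool)
timeouts = np.zeros(6, dtype=bool)

converted = to_transitions(terminals, timeouts)
orig_count = int(terminals.sum())
new_count = int(converted.sum())
print(orig_count, new_count)
```

In this toy case the terminal flag on the final sample is lost, so a check of the form orig_terminals == np.sum(dset['terminals']) would fail in the same way as in the log above.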

Checklist

  • [x] I have checked that there is no similar issue in the repo (required)

Seraphli, Mar 03 '23 15:03