D4RL
[Bug Report] Discarding the last timestep in each trajectory causes a failure in check_envs.py
Describe the bug
The check_envs.py script always fails with hammer-human-v0.
Checking hammer-human-v0
/home/seraphli/.pyenv/versions/3.8.5/envs/dien/lib/python3.8/site-packages/gym/envs/registration.py:505: UserWarning: WARN: The environment hammer-human-v0 is out of date. You should consider upgrading to version `v1` with the environment ID `hammer-human-v1`.
logger.warn(
/home/seraphli/Workspace/Github/Others/d4rl/d4rl/hand_manipulation_suite/hammer_v0.py:14: UserWarning: This environment is deprecated. Please use the most recent version of this environment.
offline_env.OfflineEnv.__init__(self, **kwargs)
load datafile: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 207.43it/s]
Max episode steps: 200
(11310, 46) (11310, 26)
11310 samples
num terminals: 25
11264 samples
num terminals: 24
Traceback (most recent call last):
File "/home/seraphli/Workspace/Github/Others/d4rl/scripts/check_envs.py", line 95, in <module>
assert orig_terminals == np.sum(dset['terminals']), 'Qlearining terminals doesnt match original terminals'
AssertionError: Qlearining terminals doesnt match original terminals
Additional context
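This is caused by the function qlearning_dataset, which discards the last timestep in each trajectory. In the run above, the dataset shrinks from 11310 to 11264 samples and the number of terminals drops from 25 to 24, so the assertion comparing the two terminal counts fails.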
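To illustrate how discarding timesteps can change the terminal count, here is a minimal sketch in plain Python. The function `toy_qlearning_view` and the toy arrays are hypothetical; this is a simplified stand-in written for this report, not the actual D4RL `qlearning_dataset` code. It assumes that transitions are built from consecutive samples, so the final sample (which has no successor) and any timeout step are dropped.

```python
def toy_qlearning_view(terminals, timeouts):
    """Hypothetical, simplified stand-in for a Q-learning style conversion.

    Keeps the terminal flag of every sample that survives the conversion.
    Assumption: the last sample is dropped because it has no next
    observation, and timeout steps are dropped because their successor
    belongs to the next trajectory.
    """
    kept = []
    n = len(terminals)
    for i in range(n - 1):  # the last sample is never visited
        if timeouts[i]:     # trajectory boundary: skip this step
            continue
        kept.append(terminals[i])
    return kept


# Toy data: two trajectories of three steps each.
# The first ends with a timeout, the second ends with a terminal flag
# that sits on the very last sample of the dataset.
terminals = [False, False, False, False, False, True]
timeouts = [False, False, True, False, False, False]

kept = toy_qlearning_view(terminals, timeouts)
orig_terminals = sum(terminals)
new_terminals = sum(kept)

print(len(terminals), "samples, num terminals:", orig_terminals)
print(len(kept), "samples, num terminals:", new_terminals)
print("counts match:", orig_terminals == new_terminals)
```

In this toy case the only terminal flag sits on a discarded timestep, so the converted data has one terminal fewer than the original. That is the same kind of mismatch the assertion in check_envs.py trips on.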
Checklist
- [x] I have checked that there is no similar issue in the repo (required)