D4RL
A single terminal flag in half cheetah environment
There seems to be exactly one terminal = true flag in each of the half cheetah datasets. Do you know why this is the case? The half cheetah gym environment never terminates; episodes only ever end through timeouts. So why would a terminal flag be set in one of the trajectories?
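For anyone who wants to reproduce the check, here is a minimal sketch of how I am counting the flags. It assumes the dataset is a dict with boolean "terminals" and "timeouts" arrays, as in the D4RL dataset format. The helper name `find_flag_indices` and the toy dataset are hypothetical stand-ins for illustration, and the flag positions in the toy data are made up, not taken from the real datasets.

```python
import numpy as np


def find_flag_indices(dataset, key):
    """Return the step indices at which the boolean array dataset[key] is set."""
    return np.flatnonzero(np.asarray(dataset[key], dtype=bool))


# Synthetic stand-in for a loaded dataset (assumption: the real one would be
# loaded from D4RL and expose the same "terminals" / "timeouts" keys).
n_steps = 3000
toy_dataset = {
    "terminals": np.zeros(n_steps, dtype=bool),
    "timeouts": np.zeros(n_steps, dtype=bool),
}
# Mark a timeout at the last step of every 1000-step episode.
toy_dataset["timeouts"][999::1000] = True
# Mark one stray terminal flag at an arbitrary, made-up position.
toy_dataset["terminals"][1234] = True

terminal_idx = find_flag_indices(toy_dataset, "terminals")
timeout_idx = find_flag_indices(toy_dataset, "timeouts")

print("terminal steps:", terminal_idx)
print("timeout steps:", timeout_idx)
```

Running the same helper on a real half cheetah dataset should show where the lone terminal flag sits relative to the timeout boundaries, which might help narrow down where it comes from.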