dm_control
Non-deterministic environments
Hi, I'm trying to use the example locomotion environments and finding that they are non-deterministic even when I pass in a seeded random state.
Example code is provided below:
import numpy as np
from dm_control.locomotion.examples import basic_cmu_2019, basic_rodent_2020, cmu_2020_tracking

for env_fn in (
    basic_cmu_2019.cmu_humanoid_maze_forage,
    basic_cmu_2019.cmu_humanoid_heterogeneous_forage,
    basic_rodent_2020.rodent_maze_forage,
    basic_rodent_2020.rodent_two_touch,
):
    print(f'{env_fn.__name__}')
    rng_1, rng_2 = np.random.RandomState(1), np.random.RandomState(1)
    env_1, env_2 = env_fn(rng_1), env_fn(rng_2)
    ts_1, ts_2 = env_1.reset(), env_2.reset()
    obs_1 = ts_1.observation
    obs_2 = ts_2.observation
    assert obs_1.keys() == obs_2.keys()
    for key in obs_1.keys():
        x = obs_1[key]
        y = obs_2[key]
        if not np.allclose(x, y):
            print(f"\tfail: {key}, error: {np.sum(np.abs(x - y))}")
Output
cmu_humanoid_maze_forage
    fail: walker/egocentric_camera, 1043595
cmu_humanoid_heterogeneous_forage
    fail: walker/egocentric_camera, 841373
rodent_maze_forage
    fail: walker/egocentric_camera, 741803
rodent_two_touch
    fail: walker/egocentric_camera, 531694
I have tested this code with all the other dm_control environments, and so far only these four show the problem. Does anyone know why these four fail? Where is the egocentric_camera data computed?
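
One caveat about the error values reported above: if the camera observation is an unsigned-integer image (an assumption, I have not checked its dtype), then `x - y` wraps around and the summed error is inflated. Below is a minimal sketch of a comparison helper that casts to float before subtracting. The name `observation_diffs` and the stand-in arrays are hypothetical and only for illustration; they are not part of dm_control.

```python
import numpy as np


def observation_diffs(obs_1, obs_2):
    """Return {key: summed absolute difference} for keys whose values differ."""
    diffs = {}
    for key in obs_1:
        # Cast to float64 first so that subtracting unsigned-integer arrays
        # (e.g. uint8 images) cannot wrap around.
        x = np.asarray(obs_1[key], dtype=np.float64)
        y = np.asarray(obs_2[key], dtype=np.float64)
        if not np.array_equal(x, y):
            diffs[key] = float(np.sum(np.abs(x - y)))
    return diffs


# Stand-in data (hypothetical, not real dm_control output).
a = {"camera": np.array([[0, 10], [200, 255]], dtype=np.uint8)}
b = {"camera": np.array([[1, 10], [200, 250]], dtype=np.uint8)}
print(observation_diffs(a, b))
```

In the loop above this would be called as `observation_diffs(ts_1.observation, ts_2.observation)`. It does not explain the non-determinism, but it should make the size of the mismatch easier to interpret.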