Replaying data from Roboset FK1-v4(human) dataset with FK1_RelaxFixed-v4 environment.
Hello,
I'm trying to replay the Roboset FK1-v4 (human) dataset and I'm running into problems with the new v4 kitchen environment. I can replay the training data using the kitchen_relax-v1 environment from Relay Policy Learning, but not with the FK1_RelaxFixed-v4 environment: the arm moves seemingly at random instead of following the trajectory from the data. Below are the code snippets, both the working kitchen_relax-v1 replay code and the non-working FK1_RelaxFixed-v4 one. Thank you very much!
Apart from that, do you happen to have an estimated date for when the other multi-task suites besides the kitchen will be released? Thanks!
FK1_RelaxFixed-v4, not working:
import torch
import h5py
import numpy as np
import gym
import time
from tqdm import tqdm
import robohive

torch.cuda.empty_cache()

trace = '/SNIP/datasets/human_demos_playdata/FK1_RelaxFixed_v2d-v4_60_20230506-111653_trace.h5'
with h5py.File(trace, 'r') as file:
    h = dict()
    # kettle to top left, bottom stove, right slider, left cupboard
    for key in file['Trial60'].keys():
        if key == 'env_infos':
            h['qpos'] = file['Trial60/env_infos/state/qpos'][()]
            h['qvel'] = file['Trial60/env_infos/state/qvel'][()]
            continue
        h[key] = file['Trial60'][key][()]
print('loaded 60')

actions = h['actions']
qpos = h['qpos'][0]
qvel = h['qvel'][0]
speedup = 1

env_name = 'FK1_RelaxFixed-v4'
# env_name = 'kitchen-v2'
env = gym.make(env_name)
env.reset()

init_qpos = qpos.copy()
init_qvel = qvel.copy()
env.sim.data.qpos[:] = init_qpos
env.sim.data.qvel[:] = init_qvel
env.sim.forward()

# pick scaling for actions
act_mid = np.zeros(env.sim.model.nu)
act_amp = 2 * np.ones(env.sim.model.nu)

env.mj_render()
obs = env.get_obs()
for i in tqdm(range(actions.shape[0] - 1)):
    ctrl = actions[i]
    # act = ctrl
    act = act_mid + ctrl * act_amp
    next_obs, reward, done, env_info = env.step(act)
    # if i % render_skip == 0:
    env.mj_render()
    time.sleep(env.dt / speedup)
    obs = next_obs
    if done:
        break
env.close()
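As a quick diagnostic when a replay drifts like this, it can help to compare the range of the recorded actions against the environment's action space, to check whether any rescaling is needed before env.step at all. This is only a rough sketch: it reuses the trace path from above and assumes the 'actions' layout shown in the snippets.

import h5py
import gym
import robohive

trace = '/SNIP/datasets/human_demos_playdata/FK1_RelaxFixed_v2d-v4_60_20230506-111653_trace.h5'
with h5py.File(trace, 'r') as file:
    actions = file['Trial60/actions'][()]

env = gym.make('FK1_RelaxFixed-v4')
# If the recorded actions already fall inside the action space,
# no extra act_mid/act_amp scaling should be applied before env.step.
print('recorded actions min/max:', actions.min(), actions.max())
print('action_space low/high:   ', env.action_space.low.min(), env.action_space.high.max())
env.close()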
kitchen_relax-v1, working:
import torch
import h5py
import numpy as np
import gym
import time
from tqdm import tqdm
import adept_envs.franka

torch.cuda.empty_cache()

trace = '/SNIP/datasets/human_demos_playdata/FK1_RelaxFixed_v2d-v4_60_20230506-111653_trace.h5'
with h5py.File(trace, 'r') as file:
    h = dict()
    # put kettle on top left, bottom stove, right slider, left cupboard
    for key in file['Trial60'].keys():
        if key == 'env_infos':
            h['qpos'] = file['Trial60/env_infos/state/qpos'][()]
            h['qvel'] = file['Trial60/env_infos/state/qvel'][()]
            continue
        h[key] = file['Trial60'][key][()]
print('loaded 60')

actions = h['actions']
qpos = h['qpos'][0]
qvel = h['qvel'][0]
speedup = 1

env = gym.make('kitchen_relax-v1')
env.reset()

init_qpos = qpos.copy()
init_qvel = qvel.copy()
env.sim.data.qpos[:init_qpos.shape[0]] = init_qpos
env.sim.data.qvel[:init_qvel.shape[0]] = init_qvel
env.sim.forward()
env.mj_render()

print(f'act_mid: {env.act_mid}, {env.act_mid.shape}\nact_amp: {env.act_amp}, {env.act_amp.shape}\nskip: {env.skip}\nframe_skip: {env.frame_skip}\nmodel.opt.timestep: {env.model.opt.timestep}\n')

for i in tqdm(range(actions.shape[0] - 1)):
    act = actions[i]
    observation, reward, done, info = env.step(act)
    env.mj_render()
    time.sleep((env.model.opt.timestep * env.frame_skip) / speedup)
    if done:
        break
env.close()
Thank you for your question. We are taking a look at this issue and will post updates here.
Does this issue occur for other expert or human (e.g. human_demos_by_task) datasets or only the human play datasets? Any additional context you can provide would be very helpful. Thank you for your patience!
Hello,
Thank you for your response. I've tried replaying the FK1_Knob1OnRandom-v4 dataset from human_demos_by_task and ran into the same issue. I've also tried replaying the DAPG(human)/door_v2d-v4 dataset with its corresponding environment, and that worked without any problems. Here's a recording of Trace0 from the play dataset replayed with the FK1_RelaxFixed-v4 env using the code snippet from my first post:
Screencast from 2023-12-15 13-40-00.webm
Thank you for your help.
Hi,
Thank you for the info. For replaying RoboHive datasets, you should be able to use the recorded 'actions' directly as the actions passed to env.step (as shown here), instead of scaling them as in the script from your first post. Additionally, we have provided a script for replaying the datasets. The following command should replay the FK1_Knob1OnRandom-v4 datasets successfully:
python logger/examine_logs.py -e FK1_Knob1OnRandom_v2d-v4 -p <path to your dataset file>/FK1_Knob1OnRandom_v2d-v4_0_20230529-204609_trace.h5
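For reference, a minimal sketch of the change in the replay loop, using the recorded actions directly (this assumes env, actions, speedup, and the state reset are set up exactly as in the FK1_RelaxFixed-v4 snippet from the first post):

# Replay using the logged actions as-is, with no act_mid/act_amp scaling.
for i in tqdm(range(actions.shape[0] - 1)):
    act = actions[i]
    obs, reward, done, env_info = env.step(act)
    env.mj_render()
    time.sleep(env.dt / speedup)
    if done:
        break
env.close()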
Let us know if this solves the issue. Thanks!
Hi, does the script work for playdata?
For example: FK1_RelaxFixed_v2d-v4_0_20230506-110624_trace