robosuite-benchmark

Trained Rollout Scripts exhibit random behavior

Open rojas70 opened this issue 3 years ago • 4 comments

As part of the Visualization of Rollout Scripts, I ran the rollout.py script with different log files such as:

python scripts/rollout.py --load_dir runs/Door-Panda-OSC-POSE-SEED129/Door_Panda_OSC_POSE_SEED129_2020_09_21_20_07_20_0000--s-0/ --horizon 200 --camera frontview

However, for this and all other files that I tested, the robot agent seems to only exhibit random behavior.

From the documentation, I thought these would be well-trained agents. The documentation states:

We provide a rollout script for executing and visualizing rollouts using a trained agent model.

Am I making a mistake or missing something here? Thank you

rojas70 • Mar 09 '21 13:03

@rojas70, it's been a while :) but did you figure this out? I'm having the same issue as you, and I'm trying to figure out what's wrong. Here is what I run (note that I didn't actually install robosuite-benchmark or rlkit):

import torch
import numpy as np
import robosuite
from robosuite.controllers import load_controller_config
from robosuite.wrappers import GymWrapper

has_renderer = True
N_episodes = 1

controller_config = load_controller_config(default_controller="OSC_POSE")

# create environment instance
env_s = robosuite.make(
    env_name="Lift", 
    robots="Panda",  
    gripper_types="default", 
    controller_configs=controller_config,   # arm is controlled using OSC
    has_renderer=has_renderer,
    has_offscreen_renderer=False, # no off-screen rendering
    control_freq=20,                        # 20 hz control for applied actions
    horizon=500,                            
    use_object_obs=True,              
    use_camera_obs=False,          
    reward_shaping=True,              
)

# Make sure we only pass in the proprio and object obs (no images)
keys = ["object-state"]
for idx in range(len(env_s.robots)):
    keys.append(f"robot{idx}_proprio-state")

# Wrap environment so it's compatible with Gym API
env = GymWrapper(env_s, keys=keys)

data = torch.load('Lift_Panda_OSC_POSE_SEED17_2020_09_13_00_26_56_0000--s-0/params.pkl', map_location=torch.device("cpu"))

policy = data['evaluation/policy']
policy.reset()

for i_episode in range(N_episodes):
    obs = env.reset()
    ret = 0.
    done = False
    i = 0
    while not done:
        action, agent_info = policy.get_action(obs)         # use observation to decide on an action
        obs, reward, done, info = env.step(action)  # take action in the environment
        ret += reward
        i += 1
        if has_renderer:
            env.render()  # render on display

    print("rollout completed with return {}".format(ret))

env.close()
env_s.close()

sguysc • Aug 17 '22 15:08
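
One way to check whether a loaded policy really behaves "randomly" is to compare its mean return against a random-action baseline on the same environment. The sketch below is only an illustration under stated assumptions: `ToyEnv` and `mean_return` are hypothetical names, the toy environment stands in for the real robosuite one, and the 4-tuple `step` return mirrors the loop in the script above rather than any particular library version.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment (not robosuite).

    Reward is 1 when the sign of the action matches the sign of the
    current observation, otherwise 0.
    """

    def __init__(self, horizon=20, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.obs = 0.0

    def reset(self):
        self.t = 0
        self.obs = self.rng.uniform(-1, 1)
        return self.obs

    def step(self, action):
        reward = 1.0 if (action > 0) == (self.obs > 0) else 0.0
        self.t += 1
        self.obs = self.rng.uniform(-1, 1)
        done = self.t >= self.horizon
        return self.obs, reward, done, {}


def mean_return(env, act_fn, episodes=5):
    """Average undiscounted return of act_fn over several episodes."""
    total = 0.0
    for _ in range(episodes):
        obs = env.reset()
        done = False
        while not done:
            obs, reward, done, _ = env.step(act_fn(obs))
            total += reward
    return total / episodes


# A policy that uses the observation versus one that ignores it.
baseline_rng = random.Random(1)
policy_return = mean_return(ToyEnv(seed=0), lambda obs: 1.0 if obs > 0 else -1.0)
random_return = mean_return(ToyEnv(seed=0), lambda obs: baseline_rng.uniform(-1, 1))

print("policy:", policy_return, "random baseline:", random_return)
```

With the real setup, `act_fn` would wrap the loaded policy (for example `lambda obs: policy.get_action(obs)[0]`, as in the script above) and the baseline would sample actions within the environment's action bounds. If the two returns are close, the policy is not using the observations it receives in any useful way.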

Hi @rojas70 @sguysc , were you able to figure out the issue? Thanks.

ivanangjx • Jan 31 '23 01:01

@ivanangjx, I don't think I did. I ended up training my own controller with RL. Also, don't forget that you need to set a specific seed so that the simulation itself is deterministic. That might be the problem.

np.random.seed(seed)

sguysc • Jan 31 '23 16:01
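
The seeding suggestion above can be extended to the other random number generators a rollout script typically touches. This is a minimal sketch, not a confirmed fix: `seed_everything` is a hypothetical helper, and it assumes the simulation and policy draw from the global Python, NumPy, and (if installed) PyTorch generators. Seeding makes sampling reproducible; it would not by itself make a policy behave well.

```python
import random

import numpy as np


def seed_everything(seed):
    """Seed the global RNGs (hypothetical helper for illustration)."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        # PyTorch is optional here; seed it only when it is available.
        import torch

        torch.manual_seed(seed)
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same NumPy draws.
seed_everything(17)
first = np.random.uniform(size=3)
seed_everything(17)
second = np.random.uniform(size=3)
assert np.array_equal(first, second)
```

Calling the helper once before creating the environment and loading the policy keeps every later draw tied to the same seed.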

Hi @sguysc, I followed your code and manually set the seed according to the folder name, but the agent still exhibits random behavior. Do you have any idea why? Thanks

junhui1997 • Dec 25 '23 11:12
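
For reference, reading the seed from a run-folder name, as described in the last comment, can be sketched as follows. `seed_from_run_dir` is a hypothetical helper, and it assumes the folder names follow the `..._SEED<number>_...` pattern seen in the paths quoted in this thread.

```python
import re


def seed_from_run_dir(name):
    """Return the integer that follows 'SEED' in a run-folder name, or None."""
    match = re.search(r"SEED(\d+)", name)
    return int(match.group(1)) if match else None


lift_seed = seed_from_run_dir("Lift_Panda_OSC_POSE_SEED17_2020_09_13_00_26_56_0000--s-0")
door_seed = seed_from_run_dir("Door_Panda_OSC_POSE_SEED129_2020_09_21_20_07_20_0000--s-0")
print(lift_seed, door_seed)
```

The extracted value can then be passed to `np.random.seed(seed)` as suggested earlier in the thread.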