baselines
How to run GAIL without an environment
Hello,
I only have expert trajectories and no environment/simulator for my problem.
I was able to convert my data into the required format:
actions (126, 4)
episode_starts (127,)
rewards (0,)
obs (126, 368)
episode_returns (14,)
Total trajectories: 10
Total transitions: 99
Average returns: 17.07973432764945
Std for returns: 13.725064365739867
I then tried to run the GAIL implementation:
model = GAIL('MlpPolicy', 'Pendulum-v0', dataset, verbose=1)
This requires an environment. I tried passing None instead, but that did not work.
Any suggestions or help would be greatly appreciated!
Thanks!
You must have obtained your data from some environment, right? I think you should at least run it in an environment similar to the one your expert data came from. Anyway, if you don't mind, could you share how you formatted your expert data? I am quite confused just from looking at the downloaded expert data provided.
@zyzhang1130 Thank you for your reply. My data was obtained from a very complex environment that has no simulator; it is all observed data. The raw data is in the following format: id, time, observations, actions, reward. I have over 10,000 such rows of time series data. I created one episode per id and converted the result into the required format, i.e., a new episode starts whenever a new id is encountered.
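For anyone trying the same conversion, here is a minimal sketch of the "new id starts a new episode" step using only NumPy. The function name `rows_to_expert_dict` and the toy data are hypothetical. It also assumes that the rows are already sorted by id and then by time, and that `episode_starts` has one entry per transition; neither assumption is confirmed in this thread. The dictionary keys are the ones printed in the first post.

```python
import numpy as np

def rows_to_expert_dict(ids, obs, actions, rewards):
    """Hypothetical helper: turn flat logged rows into per-transition arrays.

    Assumes rows are sorted by id, then by time.
    """
    ids = np.asarray(ids)
    obs = np.asarray(obs, dtype=np.float32)
    actions = np.asarray(actions, dtype=np.float32)
    rewards = np.asarray(rewards, dtype=np.float32)

    # An episode starts at row 0 and at every row whose id differs
    # from the id of the previous row.
    episode_starts = np.ones(len(ids), dtype=bool)
    episode_starts[1:] = ids[1:] != ids[:-1]

    # Sum the rewards between consecutive episode starts.
    start_idx = np.flatnonzero(episode_starts)
    episode_returns = np.add.reduceat(rewards, start_idx)

    return {
        "actions": actions,
        "obs": obs,
        "rewards": rewards,
        "episode_returns": episode_returns,
        "episode_starts": episode_starts,
    }

# Toy data: two ids, so two episodes (3 rows and 2 rows).
expert = rows_to_expert_dict(
    ids=[7, 7, 7, 9, 9],
    obs=np.zeros((5, 3)),
    actions=np.zeros((5, 2)),
    rewards=[1.0, 2.0, 3.0, 10.0, 20.0],
)
for key, value in expert.items():
    print(key, value.shape)
```

With this layout, `actions`, `obs`, `rewards`, and `episode_starts` all share the same first dimension, and `episode_returns` has one entry per id, which makes shape mismatches easy to spot before handing the data to any training code.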
@bhavikajalli Conceptually, an agent should interact with some environment. Because the expert data are essentially state-action pairs, I feel that without an environment it is not possible for the agent to learn a policy and improve upon it. I am also having trouble interfacing a complicated environment with this GAIL repo. I guess there is no easy way around it.
By the way, did you follow this format when creating your expert data?
{
'ep_rets': np.array with shape (1500,),
'obs': np.array with shape (1500, T, O),
'rews': np.array with shape (1500, T),
'acs': np.array with shape (1500, T, A)
}
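To make the shapes in that per-episode format concrete, here is a rough sketch that regroups flat per-transition arrays (with an `episode_starts` flag, as in the first post) into arrays with a leading episode dimension. The helper name `stack_episodes` is hypothetical, and the sketch assumes every episode has the same length T; with variable-length episodes the arrays cannot be stacked this way and would need padding or a different container.

```python
import numpy as np

def stack_episodes(obs, actions, rewards, episode_starts):
    """Hypothetical helper: regroup flat arrays into (N, T, ...) arrays.

    Assumes all episodes have the same length T.
    """
    obs = np.asarray(obs)
    actions = np.asarray(actions)
    rewards = np.asarray(rewards)

    # Split points are the indices where a new episode begins
    # (the first start at index 0 is not a split point).
    starts = np.flatnonzero(np.asarray(episode_starts, dtype=bool))
    split_at = starts[1:]

    obs_eps = np.split(obs, split_at)
    acs_eps = np.split(actions, split_at)
    rew_eps = np.split(rewards, split_at)

    if len({len(r) for r in rew_eps}) != 1:
        raise ValueError("episodes differ in length; cannot stack")

    return {
        "ep_rets": np.array([r.sum() for r in rew_eps]),  # (N,)
        "obs": np.stack(obs_eps),                          # (N, T, O)
        "rews": np.stack(rew_eps),                         # (N, T)
        "acs": np.stack(acs_eps),                          # (N, T, A)
    }

# Toy data: N=2 episodes, T=2 steps, O=3 observation dims, A=2 action dims.
stacked = stack_episodes(
    obs=np.zeros((4, 3)),
    actions=np.zeros((4, 2)),
    rewards=np.array([1.0, 2.0, 3.0, 4.0]),
    episode_starts=[True, False, True, False],
)
for key, value in stacked.items():
    print(key, value.shape)
```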