ElegantRL Issue with explore_one_env() and update

Issue with explore_one_env() and update_buffer()

Open richardhuo opened this issue 2 years ago • 3 comments

Test Env: Colab, SAC, env_num == 1

- In train/run.py, line 93:
trajectory, step = agent.explore_env(env, args.num_seed_steps * args.num_steps_per_episode, True)
Error message: explore_one_env() takes 3 positional arguments but 4 were given

explore_one_env is defined in Agents/AgentBase.py as:
def explore_one_env(self, env, target_step: int) -> list:
So explore_one_env accepts only 3 positional arguments (including self), but run.py passes 4.
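The mismatch can be reproduced outside ElegantRL with a small stand-in. This is only a sketch: AgentStub is a hypothetical class, not the real AgentBase, and it assumes that with a single environment explore_env simply points at explore_one_env, which is what the error message suggests.

```python
class AgentStub:
    """Hypothetical stand-in for an agent; not ElegantRL's actual class."""

    def __init__(self):
        # Assumption: with env_num == 1, explore_env is bound to explore_one_env.
        self.explore_env = self.explore_one_env

    def explore_one_env(self, env, target_step: int) -> list:
        # Same signature as quoted in the issue: self, env, target_step.
        return []


agent = AgentStub()
try:
    # Mirrors the call in run.py: env, a step count, and an extra True.
    agent.explore_env(None, 100, True)
    error_message = ""
except TypeError as err:
    error_message = str(err)

print(error_message)
```

Dropping the third argument at the call site, or adding a matching parameter to the method, would make the two sides agree. Which one is right depends on what the extra True was meant to control.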

- Another error message pops up on the same line in run.py:
trajectory, step = agent.explore_env(env, args.num_seed_steps * args.num_steps_per_episode, True)
Error message: too many values to unpack (expected 2)

explore_one_env returns only the trajectory, without step.

- In replay_buffer.py, line 131:
states, rewards, masks, actions = [torch.cat(item, dim=0) for item in traj_items]
Error message: ValueError: too many values to unpack (expected 4)
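Both unpacking errors above have the same shape: the right-hand side yields more items than there are names on the left. A minimal sketch with a hypothetical explore_stub (the 4-item return value is an assumption chosen for illustration, not what ElegantRL actually returns):

```python
def explore_stub() -> list:
    # Hypothetical: returns a single list with four entries and no separate step count.
    return ["states", "rewards", "masks", "actions"]


def try_unpack_two() -> str:
    try:
        # Two names on the left, four items on the right.
        trajectory, step = explore_stub()
    except ValueError as err:
        return str(err)
    return "ok"


message = try_unpack_two()
print(message)

# Unpacking into a single name always works; the length can then be inspected.
trajectory = explore_stub()
print(len(trajectory))
```

Checking the length of the returned object before unpacking is a quick way to see which side of the assignment is out of date.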

Jul 24 '22 01:07 richardhuo

We will update the explore_env function and fix this bug:

- cancel the max_step limit on trajectories
- update the ReplayBuffer class

Jul 26 '22 03:07 Yonv1943

Issues #1 and #2 can be resolved quickly once you understand the backend logic. Issue #3 can be resolved with this change:

original code: traj_items = list(map(list, zip(*traj_list)))
changed to: traj_items = list(map(list, zip(traj_list)))

New issues found:
- In run.py, function train_and_evaluate(args): the SAC model doesn't have reward_tracker or step_tracker.
- In run.py, function train_and_evaluate(args): evaluate_save_and_plot only accepts 4 parameters, but 5 were given.
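For reference, here is what the two zip expressions do on toy data. This is a sketch only: the strings and floats stand in for tensors, and the real layout of traj_list in ElegantRL is not shown in this thread.

```python
# Toy data: three transitions, each a (state, reward, mask, action) tuple.
traj_list = [
    ("s0", 0.1, 1.0, "a0"),
    ("s1", 0.2, 1.0, "a1"),
    ("s2", 0.3, 0.0, "a2"),
]

# zip(*traj_list) transposes: one list per field, so len == fields per entry.
transposed = list(map(list, zip(*traj_list)))

# zip(traj_list) wraps each entry in a 1-tuple: len == number of entries.
wrapped = list(map(list, zip(traj_list)))

print(len(transposed), transposed[1])
print(len(wrapped), wrapped[0])
```

So with zip(traj_list), unpacking into four names only succeeds when traj_list itself has exactly four elements, for example when it already holds one entry per field. It is worth confirming which layout the buffer receives before relying on either form.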

Aug 07 '22 11:08 richardhuo

Do you have a timetable or deadline for fixing this issue? I can fix it in my local install, but I suggest fixing it in master as well, since it is quite a significant error.

Aug 22 '22 00:08 edwarts
