ElegantRL
Issue with explore_one_env() and update_buffer()
Test Env: Colab, SAC, env_num == 1
- In train/run.py, line 93: trajectory, step = agent.explore_env(env, args.num_seed_steps * args.num_steps_per_episode, True). Error message: explore_one_env() takes 3 positional arguments but 4 were given
explore_one_env is defined in Agents/AgentBase.py as def explore_one_env(self, env, target_step: int) -> list: so explore_one_env accepts only 3 parameters, but 4 are passed in run.py.
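The argument-count mismatch can be reproduced in isolation. The stub below only mirrors the reported signature; the real method body is elided:

```python
class AgentBase:
    # Stub mirroring the reported signature in Agents/AgentBase.py:
    # only self, env, and target_step are accepted.
    def explore_one_env(self, env, target_step):
        return []  # the real method returns a trajectory list

agent = AgentBase()
try:
    # run.py passes a fourth positional argument (a boolean flag)
    agent.explore_one_env(None, 100, True)
except TypeError as err:
    print(err)  # "... takes 3 positional arguments but 4 were given"
```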
- Another error message pops up at the same line in run.py, trajectory, step = agent.explore_env(env, args.num_seed_steps * args.num_steps_per_episode, True). Error message: too many values to unpack (expected 2)
The explore_one_env function returns only the trajectory, without step.
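Unpacking a bare trajectory list into two names reproduces this second error in isolation (a hypothetical three-step trajectory is used as the return value):

```python
def explore_one_env():
    # returns only the trajectory list, not a (trajectory, step) pair
    return [("state", "reward", "mask", "action")] * 3

try:
    trajectory, step = explore_one_env()  # the caller expects two values
except ValueError as err:
    print(err)  # too many values to unpack (expected 2)
```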
- In replay_buffer.py, line 131: states, rewards, masks, actions = [torch.cat(item, dim=0) for item in traj_items]. Error message: ValueError: too many values to unpack (expected 4)
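This unpack fails whenever traj_items holds more than four groups, i.e. each trajectory step carries more fields than the buffer expects. A minimal demo without torch, using a hypothetical 5-field step:

```python
# Hypothetical 5-field step: one field more than the buffer's four names.
traj_list = [("state", "reward", "mask", "action", "noise")]
traj_items = list(map(list, zip(*traj_list)))  # transposes into 5 groups

try:
    states, rewards, masks, actions = traj_items
except ValueError as err:
    print(err)  # too many values to unpack (expected 4)
```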
We will update the function explore_env and fix this bug:
- cancel the max_step limit of the trajectory
- update the class ReplayBuffer
Issues #1 and #2 can be resolved quickly once you understand the backend logic.
Issue #3 can be resolved by changing the original code
traj_items = list(map(list, zip(*traj_list)))
to
traj_items = list(map(list, zip(traj_list)))
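The difference between the two calls can be seen on a toy trajectory of two hypothetical (state, reward) steps:

```python
traj_list = [(0, 1.0), (1, 0.5)]  # two steps of (state, reward)

# zip(*traj_list) transposes: one group per field, across all steps
print(list(map(list, zip(*traj_list))))  # [[0, 1], [1.0, 0.5]]

# zip(traj_list) keeps each step intact, wrapped in a singleton list
print(list(map(list, zip(traj_list))))   # [[(0, 1.0)], [(1, 0.5)]]
```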
New issues found:
- In run.py, function train_and_evaluate(args): the SAC model doesn't have a reward_tracker or step_tracker.
- In run.py, function train_and_evaluate(args). Error message: you can see that evaluate_save_and_plot can only handle 4 parameters, but 5 are given in run.py.
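Until the trackers are added to SAC, one defensive workaround (a sketch of a local patch, not the project's actual fix) is to look the attributes up with a default instead of accessing them directly:

```python
class AgentSAC:
    pass  # minimal stub: like the reported SAC model, no trackers defined

agent = AgentSAC()
# getattr with a default avoids an AttributeError when a tracker is absent
reward_tracker = getattr(agent, "reward_tracker", None)
step_tracker = getattr(agent, "step_tracker", None)
print(reward_tracker, step_tracker)  # None None
```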
Do you have a timetable or deadline for fixing this issue? I can fix it in my local install, but I suggest this be fixed in your master branch, as it is quite a significant error.