HARL Bug: There isn't reset of the environment when training

Open handleandwheel opened this issue 8 months ago • 1 comment

I noticed that with on-policy algorithms, data collection is done in the run function of OnPolicyBaseRunner. However, in my experiments, my environment was not reset even after it had returned done == True. Following this clue, I found that neither the run function nor any function it calls contains a reset procedure that handles this case.
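
To make the expected behavior concrete, here is a minimal sketch of a data collection loop that resets the environment as soon as an episode ends. It is not HARL code: ToyEnv, collect, and the lambda policy are hypothetical names made up for illustration, and the environment is assumed to follow the common (obs, reward, done, info) step convention.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical stand-in environment: an episode ends after `horizon` steps."""
    horizon: int = 3
    t: int = 0
    resets: int = 0  # counts how often reset() was called

    def reset(self):
        self.t = 0
        self.resets += 1
        return 0  # initial observation

    def step(self, action):
        self.t += 1
        obs = self.t
        reward = 1.0
        done = self.t >= self.horizon
        return obs, reward, done, {}


def collect(env, policy, num_steps):
    """Collect `num_steps` transitions, resetting whenever `done` is True."""
    obs = env.reset()
    transitions = []
    for _ in range(num_steps):
        action = policy(obs)
        next_obs, reward, done, info = env.step(action)
        transitions.append((obs, action, reward, done))
        # Without this reset, the loop would keep stepping a finished episode.
        obs = env.reset() if done else next_obs
    return transitions


env = ToyEnv(horizon=3)
data = collect(env, policy=lambda obs: 0, num_steps=7)
```

In this sketch the reset lives in the collection loop itself. Some frameworks instead place it inside an environment wrapper so that the runner never calls reset explicitly, so it may be worth checking the environment wrappers as well as the run function.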

Jun 27 '24 09:06 handleandwheel
