rlpyt
Diagnostics/NewCompletedTrajs 0 on some iterations
I'm using a SerialSampler with the default collector, and on some training iterations I'm getting 0 new completed trajectories along with 0 StepsInTrajWindow. On those iterations the DiscountedReturn line is also missing from the log, as are all the other TrajInfo values.
Does anyone know what might cause this? I'm not getting any warnings or errors, but I'm concerned I might be producing actions with NaNs in them.
I'm using the Hopper-v3 environment.
Hmm, how long is your config["runner"]["log_interval_steps"], and how many environments are you using (config["sampler"]["batch_B"])? If the first of those is small and the second is large, it could simply be that none of the environment instances reached the end of its trajectory since the last logging event.
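To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes the runner logs every max(log_interval_steps // (batch_T * batch_B), 1) iterations and that each environment advances batch_T steps per iteration; the helper name and the example settings are hypothetical, so substitute your own config values.

```python
def env_steps_between_logs(log_interval_steps, batch_T, batch_B):
    """Estimate how many steps each environment advances between two log events."""
    # Samples collected per sampler iteration (time steps x environments).
    itr_batch_size = batch_T * batch_B
    # Assumed rounding: whole iterations per log, and at least one.
    log_interval_itrs = max(log_interval_steps // itr_batch_size, 1)
    return log_interval_itrs * batch_T


# Hypothetical settings, not taken from the original post.
steps = env_steps_between_logs(log_interval_steps=10_000, batch_T=5, batch_B=16)
max_episode_steps = 1000  # Gym's default time limit for Hopper-v3
print(steps, steps < max_episode_steps)
```

If the per-environment step count comes out well below the typical episode length, a log interval with no completed trajectories is plausible and does not by itself point to NaN actions. Hopper episodes often end earlier than the time limit when the agent falls, so comparing against 1000 is the pessimistic case.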