
Questions on the episode length of 1000 on gfootball env instead of a maximum env limit of 400

Open DeeDive opened this issue 1 year ago • 1 comments

Dear authors,

Thank you for this work! Could you please clarify something that confuses me? The gfootball env terminates after at most 400 steps, as stated in the gfootball paper, yet the gfootball training scripts set an episode length of 1000. Could you explain the motivation for that? (For an example football script, see https://github.com/marlbenchmark/on-policy/blob/b21e0f743bd4516086825318452bb6927a33538d/onpolicy/scripts/train_football_scripts/train_football_ca_hard.sh#L14C16-L14C20)

Best!

DeeDive, Aug 22 '23 13:08

I know that the vec env automatically resets the env when it encounters the done=True flag. But I would appreciate it if you could answer the following questions:

  1. How is this length value typically chosen?
  2. Why is it set here to more than twice the maximum allowed episode length?
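
To make my understanding concrete, here is a minimal sketch of what I assume happens when the rollout length (1000) exceeds the env's step limit (400). `ToyEnv` and `collect_rollout` are hypothetical names used only for illustration; they are not the gfootball env or this repo's runner, and the auto-reset here only imitates what I believe the vec env does.

```python
class ToyEnv:
    """Hypothetical stand-in for an env with a hard step limit (not gfootball)."""

    def __init__(self, max_steps=400):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps  # env-side termination at the step limit
        return self.t, 0.0, done, {}


def collect_rollout(env, rollout_length):
    """Collect a fixed number of steps, resetting whenever done=True (assumed behavior)."""
    env.reset()
    dones = []
    for _ in range(rollout_length):
        _, _, done, _ = env.step(0)
        dones.append(done)
        if done:
            env.reset()  # imitates the vec env's automatic reset
    return dones


dones = collect_rollout(ToyEnv(max_steps=400), rollout_length=1000)
finished_episodes = sum(dones)                   # episodes that ended inside the rollout
leftover_steps = 1000 - 400 * finished_episodes  # steps of the unfinished last episode
print(finished_episodes, leftover_steps)
```

If this picture is right, one rollout of 1000 steps would contain two complete episodes plus the first 200 steps of a third, so the value would act as a buffer length rather than as a true episode length.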

DeeDive, Aug 22 '23 13:08