Question about the episode length of 1000 for the gfootball env, given the env's maximum of 400 steps
Dear authors,
Thank you for this work! Could you please clarify something that confuses me? According to the gfootball paper, the env terminates after at most 400 steps, yet the gfootball training scripts set an episode length of 1000. Could you explain the motivation for that? (For an example football script, see https://github.com/marlbenchmark/on-policy/blob/b21e0f743bd4516086825318452bb6927a33538d/onpolicy/scripts/train_football_scripts/train_football_ca_hard.sh#L14C16-L14C20)
Best!
I know that the vec env automatically resets the env when it encounters the done=True flag. But I would appreciate it if you could answer the following questions:
- how is this length value typically chosen, and
- why is it set here to more than twice the maximum allowed episode length?
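
To make my understanding concrete, here is a toy sketch of what I think happens when the rollout length exceeds the env limit. It assumes the episode length setting is the number of steps collected per env before each update, which I have not verified in the code. `ToyEnv` and `collect_rollout` are hypothetical names I made up for illustration; they are not part of on-policy or gfootball.

```python
class ToyEnv:
    """Hypothetical stand-in for a scenario that ends after max_steps steps."""

    def __init__(self, max_steps=400):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps  # env-side step limit
        return self.t, 0.0, done, {}


def collect_rollout(env, episode_length):
    """Collect episode_length steps, resetting whenever done=True
    (mimicking what I understand the vec env auto-reset to do)."""
    env.reset()
    dones = []
    for _ in range(episode_length):
        _, _, done, _ = env.step(0)
        dones.append(done)
        if done:
            env.reset()
    return dones


dones = collect_rollout(ToyEnv(max_steps=400), episode_length=1000)
finished = sum(dones)  # env episodes completed inside the rollout
last_done = max(i + 1 for i, d in enumerate(dones) if d)
steps_in_partial = len(dones) - last_done  # steps of the unfinished episode
print(finished, steps_in_partial)
```

Under these assumptions, one rollout of 1000 steps would contain two complete env episodes plus the first 200 steps of a third, which is why I am unsure how the value 1000 was chosen.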