PettingZoo icon indicating copy to clipboard operation
PettingZoo copied to clipboard

[in progress] Truncation update for mpe

Open WillDudley opened this issue 2 years ago • 2 comments

Summary

  • All MPE envs are subclasses of AECEnvs (I'm aware there's a proposal to depreciate AECEnvs for ParallelEnvs)
  • Hence AECEnvs and corresponding utils have to be refactored
  • Once done with the above, a naive test ran without issue*
  • Obviously the tests now failed, and I tried modifying them to account for truncation
  • The reworked tests for MPE envs work with exception of two bugs (see the #todo lines ), which are currently ignored
  • I tried refactoring magent envs as well but I am too unfamiliar with their logic at present.

Changelist

  • _dones_step_first() -> deads_change_first() - "done" depreciated to avoid confusion with old versions of PZ/Gym, replaced with "dead"
  • dones -> terminations & truncations - generally, the done boolean is taken to now be termination or truncation
  • done = True -> truncation = True where timeout occurs
  • outputs of env.last() changed to return terminated and truncated instead of done
  • relevant docstring changes
  • ! 2 tests removed ! (they were failing, idk why, hopefully other PRs will shed light on these)

This is quite a rough PR atm. I'd like to sort out the two bugs and maybe have more tests that verify that no truncated or terminated agent can be alive. But yes, it seems like refactoring MPE isn't isolated and affects all AECEnvs. Hence we should probably:

  1. Decide whether we want to keep working with AECEnvs for now or just switch to ParallelEnvs already,
  2. Agree on syntax for base environments before seriously working on our individual types of environment (butterfly, magent etc.)
  • code:
import time

from pettingzoo.mpe import simple_v2, simple_adversary_v2, simple_crypto_v2, simple_push_v2, simple_reference_v2, \
    simple_speaker_listener_v3, simple_spread_v2, simple_tag_v2, simple_world_comm_v2

env_objects = [simple_v2, simple_adversary_v2, simple_crypto_v2, simple_push_v2, simple_reference_v2, simple_speaker_listener_v3, simple_spread_v2, simple_tag_v2, simple_world_comm_v2]

for env_object in env_objects:
    env = env_object.env()
    env.reset()
    for agent in env.agent_iter():
        env.render()
        env.step(env.action_space(agent).sample() if not (env.terminations[agent] or env.truncations[agent]) else None)
        #time.sleep(0.1)
    env.close()

WillDudley avatar Aug 14 '22 00:08 WillDudley

Well, this is actually a pretty good PR, I never expected the tests to also get fixed (along with some other non-MAgent envs).

AECEnvs will definitely still stay around for the time being.

I agree that syntax for the base environments have to be decided on first, but the problem is that base environments don't apply to all the different groups of environments, hence what I was proposing was have one person make a PR per environment group, and we can then decide from the PR.

jjshoots avatar Aug 15 '22 16:08 jjshoots

Thanks. I couldn't see a way of neatly refactoring MPE envs without refactoring the utils (and the tests for my own sanity).

You can think of this as the PR for MPE envs then. There just may be some merge conflicts with utils and tests with other PRs. But it shouldn't be that difficult to resolve these conflicts as you were saying.

Will review the code over myself and then tag as ready for review.

WillDudley avatar Aug 15 '22 19:08 WillDudley

Thanks for the review! Would appreciate a second glance over my latest commits if poss :)

WillDudley avatar Aug 16 '22 23:08 WillDudley

Brilliant, thanks so much for your assistance! I'm happy that the tests now pass in the mpe environments.

Obviously this API update is bigger than this PR, and I'd be happy to consult on other envs (though I've got other stuff to focus on in the immediate future).

WillDudley avatar Aug 17 '22 20:08 WillDudley

@WillDudley Thanks for the help! I'm looking to do the same for the SISL environments actually, but I can't figure out where the step function returns things (though I haven't looked at it too thoroughly), you could look at that if you have the time?

The goal after the truncation steps is to update all the tests and merge into the master branch before we go on to render API change.

jjshoots avatar Aug 17 '22 21:08 jjshoots