Jiayi Weng comments

Results: 303 comments of Jiayi Weng

MultiAgentPolicyManager misses rewards

@benblack769 you can change any part of tianshou's code to make a PR, and I'll try my best to help.

MultiAgentPolicyManager misses rewards

> what did you mean by " instead of letting MAPM changing buffer over and over again"?

I mean to naturally support multi-agent environments, i.e., to fully utilize the PettingZoo env....

How to support multi-agent reinforcement learning

> I don't think there is a need for multi-agent reinforcement learning in the short term. To me the priority is to improve the current functionality. There are major flaws in...

How to support multi-agent reinforcement learning

> varied number of agents (each simulation has a different number of agents)

Sorry about that, that is actually beyond the current scope. I haven't come up with a good...

How to support multi-agent reinforcement learning

hmm yep, you're right

How to support multi-agent reinforcement learning

Can you pass the same reference into MAPM, i.e., MultiAgentPolicyManager([policy_1, policy_1, policy_2])? I think that would be fine to some extent.
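
What "passing the same reference" means can be sketched in plain Python. This is only an illustration: `ToyPolicy` and its `learn` method are hypothetical stand-ins, not Tianshou classes, and nothing here shows how MultiAgentPolicyManager itself handles its policy list.

```python
class ToyPolicy:
    """Hypothetical stand-in for a policy object; not a Tianshou class."""

    def __init__(self, name):
        self.name = name
        self.update_count = 0  # number of times learn() has been called

    def learn(self):
        # A real policy would update its parameters here.
        self.update_count += 1


policy_1 = ToyPolicy("shared")
policy_2 = ToyPolicy("solo")

# The first two slots hold the very same object, so two agents share one policy.
policies = [policy_1, policy_1, policy_2]

# Updating once per slot touches the shared object twice.
for policy in policies:
    policy.learn()

print(policies[0] is policies[1])                       # True
print(policy_1.update_count, policy_2.update_count)     # 2 1
```

Because the list stores references rather than copies, any state change made through one slot is visible through the other.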

How to support multi-agent reinforcement learning

Exactly, that should be a tiny issue, but I think it would be fine for the agent to learn, though it is a little bit inefficient.

How to support multi-agent reinforcement learning

> I had to change the code for the specific algorithm

I don't quite understand. Do you mean you use different algorithms in the same MAPM? The current implementation only accepts...

How to support multi-agent reinforcement learning

Yeah, you can definitely do that. In MAPM the only thing you need to do is reorganize the Batch into buffer-style data (reshape or flatten to let the 1st dim...
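
The comment above is cut off, so the exact target layout is not known. As one possible reading, assuming the goal is a flat sequence whose first dimension indexes individual samples, a flatten step could look like the sketch below. It uses plain nested lists rather than Tianshou's Batch, and `flatten_agent_axis` is a hypothetical helper name.

```python
def flatten_agent_axis(per_agent_steps):
    """Flatten [agent][step] nested lists into one list of per-sample records.

    Assumption: the first dimension of the result should index samples,
    with each agent's steps kept in their original order.
    """
    return [step for agent_steps in per_agent_steps for step in agent_steps]


# Two agents, two steps each; every record carries its agent id and a reward.
data = [
    [{"agent": 0, "rew": 1.0}, {"agent": 0, "rew": 0.0}],
    [{"agent": 1, "rew": 0.5}, {"agent": 1, "rew": 2.0}],
]

flat = flatten_agent_axis(data)

print(len(flat))                    # 4
print([r["agent"] for r in flat])   # [0, 0, 1, 1]
```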

How to support multi-agent reinforcement learning

> Could I offer a thought? You might want to consider using the PettingZoo parallel API (it has two). The parallel API isn't significantly different from what I understand you're proposing...