PyTorch-Counterfactual-Multi-Agent-Policy-Gradients-COMA
Did you forget about the reward from the second agent?
Hi... I'm new to COMA and PyTorch. As I read your code, it's impressive and useful. However, I saw that at line 165 you didn't include the reward from the second agent. I would like to know why...
That is because in this environment the two agents share the same reward. You can definitely record the rewards from the other agents as well.
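If you do want to keep each agent's reward instead of a single shared value, a rough sketch could look like the following (this is not the repo's code; the variable names and the dummy rewards are purely illustrative):

```python
# Rough sketch (illustrative, not the repo's code): store a tuple of per-agent
# rewards for every time step instead of a single shared scalar.
import random

T = 10                                         # dummy episode length
r_list = []                                    # r_list[t] -> (r_agent1, r_agent2)

for t in range(T):
    r1, r2 = random.random(), random.random()  # stand-in for the env's per-agent rewards
    r_list.append((r1, r2))                    # record rewards from both agents
```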
Hi! If I change the environment, e.g. to env_CatchPigs, in which the two agents have different rewards, what should be stored in r_list[t] (used in line 94)? The sum of the rewards of agent1 and agent2? I'm new to COMA too...
It seems that in this environment, the rewards of the two agents may be different at the end of an episode!
Seems so. But they surely can have different rewards.
I tried giving the two agents the same reward (the sum of their independent rewards), as in the COMA paper, and your implementation still works!
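For anyone else landing here, a minimal sketch of that idea (assumptions: the per-agent rewards r1 and r2 come from the environment each step, and r_list mirrors the variable discussed above rather than the actual repo code) could be:

```python
# Minimal sketch (illustrative): COMA is fully cooperative, so one option is to
# store a single team reward per step, e.g. the sum of the agents' rewards.
import random

T = 10                        # dummy episode length
r_list = [None] * T           # r_list[t] holds the shared team reward at step t

for t in range(T):
    # stand-in for env.step(...): per-agent rewards that may differ
    r1, r2 = random.random(), random.random()
    r_list[t] = r1 + r2       # shared team reward, as in the COMA paper
```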