PyTorch-Counterfactual-Multi-Agent-Policy-Gradients-COMA
It seems that running `COMA2.py` results in a memory leak: the program consumes all available memory after some episodes.
Hi... I'm new to COMA and PyTorch. I read your code, and it's impressive and useful. However, I saw that at line 165 you didn't include the reward from the second agent....
Hi, I am wondering whether the oscillation during training comes from the fact that you only include down-sampling layers in your actor nets, since in partially observable domains,...
Hi! I found that the env_FindGoals used in this repo is different from the env_FindGoals in your other repo. The environment used by this repo will only give positive rewards,...