PyTorch-Counterfactual-Multi-Agent-Policy-Gradients-COMA
Memory leak
It seems that running COMA2.py results in a memory leak: the program consumes all available memory after some episodes.
First check whether the leak comes from COMA or from envFindGoal: simply run random actions on envFindGoal. If that works without leaking, the problem is in COMA.
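A minimal random-action rollout for this kind of isolation test might look like the following. `DummyEnv` is a hypothetical stand-in for `env_FindGoals` (whose real interface I'm only assuming here); the point is just to drive the environment without any COMA code in the loop and watch memory.

```python
import random
import tracemalloc

class DummyEnv:
    """Hypothetical stand-in for env_FindGoals (its real API may differ)."""
    def __init__(self):
        self.steps = 0
    def reset(self):
        return [0, 0]  # one observation per agent
    def step(self, actions):
        self.steps += 1
        return [0, 0], [0.0, 0.0], False  # obs, rewards, done

def random_rollout(env, n_steps=1000, n_actions=5):
    # Drive the environment with random actions only: if memory still
    # grows without bound here, the leak is in the env, not in COMA.
    obs = env.reset()
    for _ in range(n_steps):
        actions = [random.randrange(n_actions) for _ in range(2)]
        obs, rewards, done = env.step(actions)
        if done:
            obs = env.reset()

tracemalloc.start()
env = DummyEnv()
random_rollout(env)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"peak memory during random rollout: {peak} bytes")
```

If the peak stays flat as you increase `n_steps` against the real env, the env is clean and the leak is on the COMA side.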
It seems that the memory leak is caused by the function cross_prod, and can be avoided by wrapping the call in with torch.no_grad():.
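A sketch of why that helps, with a stand-in `cross_prod` (I'm assuming the repo's function combines per-agent action distributions into a joint one; the real implementation may differ): calls made outside `no_grad` keep extending the autograd graph, and every tensor the graph references stays alive, which accumulates like a leak over episodes.

```python
import torch

def cross_prod(p1, p2):
    # Hypothetical stand-in for the repo's cross_prod: joint action
    # probabilities as the outer product of two per-agent distributions.
    return torch.outer(p1, p2)

p1 = torch.softmax(torch.randn(5, requires_grad=True), dim=0)
p2 = torch.softmax(torch.randn(5, requires_grad=True), dim=0)

# Without no_grad, the result is attached to the autograd graph, so the
# graph (and everything it references) is retained across calls.
leaky = cross_prod(p1, p2)
assert leaky.requires_grad

# Wrapping the call in torch.no_grad() detaches it from the graph, so
# intermediate buffers can be freed after each episode.
with torch.no_grad():
    safe = cross_prod(p1, p2)
assert not safe.requires_grad
```

This is only safe where you do not need gradients through `cross_prod`; if the result feeds a loss you backpropagate, the fix has to go elsewhere (e.g. releasing the graph after each update).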
I also need your help with another question!
In the definition in FindGoals.pdf, an episode ends once any one agent reaches its goal, while in env_FindGoals.py an episode ends only when the first agent arrives. I supposed that the goal of the 'FindGoals' task should be for both agents to find their goals. The policies learned by your code only drive one agent to its goal while the other stays at its start position.
You can simply modify the last several lines of the step() function in env_FindGoals to determine when the episode should stop.
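As a sketch of that change (the helper name and the position/goal representation here are my assumptions, not the repo's actual code), the stop condition can be factored out so switching between "any agent" and "both agents" termination is a one-flag change:

```python
def episode_done(positions, goals, require_all=True):
    # End-of-episode test for a two-agent FindGoals-style task.
    # require_all=True  -> episode ends only when BOTH agents are at their goals
    # require_all=False -> episode ends as soon as ANY agent reaches its goal
    reached = [pos == goal for pos, goal in zip(positions, goals)]
    return all(reached) if require_all else any(reached)

# Only agent 0 has arrived:
print(episode_done([(1, 1), (0, 0)], [(1, 1), (3, 3)]))                     # -> False
print(episode_done([(1, 1), (0, 0)], [(1, 1), (3, 3)], require_all=False))  # -> True
```

Calling something like this at the end of step() with `require_all=True` would make the episode run until both agents reach their goals, matching the reading of the task above.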