
RayOutOfMemoryError: More than 95% of the memory on node xxx is used

Open. AptX395 opened this issue 2 years ago • 1 comment

Hello, thank you so much for the reproduction code!

I encountered a RayOutOfMemoryError when running the code. To address it, I tried reducing num_agents from the default of 16 to 8 and then to 4, but the error persists. I don't know whether the other parameters (such as num_rollout and num_arms) should be changed as well.

My machine is a Linux server with 128 GB of RAM and four 2080 Ti GPUs. Could you please show me how to configure the parameters appropriately?

Looking forward to your reply, thanks!

AptX395, Jun 15 '22 03:06

I have also tried setting num_agents, num_rollout, and num_arms to 2, 1, and 4, respectively. The RayOutOfMemoryError still occurs. Why does the memory usage keep rising?

AptX395, Jun 16 '22 02:06
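
One way to narrow down why memory keeps rising, independent of the parameter values, is to compare two allocation snapshots and see which source lines grew. The sketch below uses only Python's standard `tracemalloc` module. The helper name `top_growth` and the loop that fills `buffer` are hypothetical stand-ins for illustration, not code from this repository. Note the limits: `tracemalloc` only sees allocations made through Python's allocator in the process where it was started, so with Ray it would likely need to be started inside each actor or worker process, and it does not account for GPU memory.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return up to `limit` allocation sites that grew between two snapshots.

    `before` and `after` are tracemalloc.Snapshot objects. The result is a
    list of StatisticDiff entries with a positive size_diff, largest first.
    """
    # compare_to() sorts by the absolute size difference, largest first.
    stats = after.compare_to(before, "lineno")
    return [s for s in stats if s.size_diff > 0][:limit]


if __name__ == "__main__":
    tracemalloc.start()
    before = tracemalloc.take_snapshot()

    # Hypothetical stand-in for a loop that keeps appending to a buffer
    # without ever evicting old entries.
    buffer = []
    for step in range(1000):
        buffer.append(bytearray(10_000))

    after = tracemalloc.take_snapshot()
    for stat in top_growth(before, after):
        # Each entry prints as file:line with the size and count deltas.
        print(stat)
    tracemalloc.stop()
```

If the largest entries point at a buffer or list that is never trimmed, that would suggest the growth comes from stored data rather than from the number of agents, which might explain why lowering num_agents alone does not help. This is only a diagnostic aid and does not confirm the actual cause here.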