ray
[RLlib] PolicyServerInput memory leak
What happened + What you expected to happen
PolicyServerInput keeps storing samples in its samples_queue, regardless of the rate at which samples are generated. If samples arrive faster than they are consumed, memory usage grows without bound. No warning was logged while memory kept increasing, and it took me several hours of debugging to track down the culprit.
At the very least, I would expect a warning to be logged when the samples_queue reaches very high memory usage (500GB in my case, in under 24 hours).
This is a well-known producer-consumer problem, and one solution would be to implement some kind of back-pressure mechanism. Alternatively, the queue could be trimmed whenever a 'max-size' limit is reached while adding new samples. As a workaround, I had to reduce the number of clients generating samples.
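To make the two options concrete, here is a minimal sketch using only Python's standard queue module. It is not the RLlib implementation or the fix from any PR; the helper names put_with_backpressure and put_drop_oldest and the maxsize value are hypothetical and chosen only for illustration.

```python
import queue


def put_with_backpressure(q, sample, timeout=None):
    """Option 1: back-pressure.

    Blocks the producer until the consumer frees a slot. Raises
    queue.Full if `timeout` seconds pass without a free slot.
    """
    q.put(sample, block=True, timeout=timeout)


def put_drop_oldest(q, sample):
    """Option 2: cap the queue at its maxsize.

    Never blocks. When the queue is full, the oldest sample is evicted
    to make room. Returns how many samples were dropped.
    """
    dropped = 0
    while True:
        try:
            q.put_nowait(sample)
            return dropped
        except queue.Full:
            try:
                q.get_nowait()  # evict the oldest entry
                dropped += 1
            except queue.Empty:
                # A consumer drained the queue in between; just retry.
                pass


# Hypothetical limit; a real value would depend on sample size and RAM.
samples_queue = queue.Queue(maxsize=3)
for i in range(5):
    put_drop_oldest(samples_queue, i)
```

With back-pressure the clients slow down to the learner's pace, which keeps every sample but stalls sample generation. Dropping the oldest samples keeps the clients running at full speed at the cost of discarding data, so which one fits depends on whether stale samples are still useful for training.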
Versions / Dependencies
Ray: 2.0.0 Python: 3.8.10 Ubuntu: 20.04.05
Reproduction script
I would expect you can reproduce this by starting from the Cartpole Server example. Create a big network and set num_sgd_iter to a high number. Create multiple clients. Track the size of self.samples_queue in the PolicyServerInput. If it increases consistently, you will eventually run out of memory.
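One way to track the queue size during reproduction is a small check that logs a warning once a threshold is crossed. This is a sketch under the assumption that samples_queue behaves like a standard queue.Queue and exposes qsize(); the function name check_queue_size and the threshold are hypothetical, and the snippet is not part of RLlib.

```python
import logging
import queue

logger = logging.getLogger("samples_queue_monitor")


def check_queue_size(q, warn_threshold=1000):
    """Log a warning and return True if the queue holds at least
    `warn_threshold` items. qsize() is approximate under concurrency,
    which is good enough for spotting steady growth."""
    size = q.qsize()
    if size >= warn_threshold:
        logger.warning(
            "samples queue holds %d items (threshold %d); "
            "producers appear to be outpacing the consumer",
            size,
            warn_threshold,
        )
        return True
    return False
```

Calling this periodically (for example, each time a batch is taken from the queue) and watching whether the reported size keeps climbing should be enough to confirm the leak.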
Issue Severity
Medium: It is a significant difficulty but I can work around it.
@MattiasDC : PR in review :)
Thanks for taking the time to fix this!
Sorry to necro an old thread, but I manually applied the PR and it fixed the issue for me without a problem. Is there anything we can do to get it merged into main?