muzero-general

Keep replay buffer on disk (not in memory), allowing it to grow to any size.

Open me-unsolicited opened this issue 3 years ago • 2 comments

Hello, please consider this pull request which I implemented based on this comment.

Mainly, I added a GameHistoryDao class that creates "replay_buffer.db" with a simple key->value table and stores the games there like a dictionary. If this approach is acceptable, it can be optimized further by moving reanalysed_predicted_root_values, priorities, and game_priority into their own columns, which would avoid serializing and deserializing the full observation history on every update. However, I think that would require more invasive changes to the existing code.
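For readers unfamiliar with the idea, here is a minimal sketch of a dictionary-like store backed by a single key->value table. It is not the PR's GameHistoryDao: the class name `GameHistoryStore`, the table name `games`, and the use of `sqlite3` with `pickle` are all assumptions made for illustration.

```python
import pickle
import sqlite3


class GameHistoryStore:
    """Hypothetical dictionary-like store (illustration only, not the PR's class)."""

    def __init__(self, path="replay_buffer.db"):
        # Assumption: SQLite is the storage engine; the PR may use something else.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS games ("
            "id INTEGER PRIMARY KEY, value BLOB NOT NULL)"
        )
        self.conn.commit()

    def __setitem__(self, game_id, game_history):
        # The whole game object is serialized into one BLOB value.
        blob = pickle.dumps(game_history, protocol=pickle.HIGHEST_PROTOCOL)
        self.conn.execute(
            "INSERT OR REPLACE INTO games (id, value) VALUES (?, ?)",
            (game_id, blob),
        )
        self.conn.commit()

    def __getitem__(self, game_id):
        row = self.conn.execute(
            "SELECT value FROM games WHERE id = ?", (game_id,)
        ).fetchone()
        if row is None:
            raise KeyError(game_id)
        return pickle.loads(row[0])

    def __len__(self):
        return self.conn.execute("SELECT COUNT(*) FROM games").fetchone()[0]
```

With this layout every priority update rewrites the whole pickled game, which is the cost the separate-column optimization would remove.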

me-unsolicited, Apr 22 '21 05:04

Hi @me-unsolicited, thank you for this new feature. After reviewing and testing the code, we found that it slows training down considerably. We will therefore wait to merge this PR until we find a way to speed up training while keeping the replay buffer on disk.

ahainaut, Apr 25 '21 20:04

@ahainaut, thanks for the feedback! I made some improvements, and it runs much faster now.

Changes:

  1. Use SQL to efficiently sample from the prioritized replay buffer.
  2. Store priorities and predicted values in separate columns from the full object.
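As a rough illustration of the two changes, the sketch below keeps `game_priority` in its own column and picks a game in proportion to it with a single SQL query, so the pickled game is only read once a game has been chosen. The schema, the function names, and the running-sum query are assumptions, not the PR's actual code. The query uses a window function, which needs SQLite 3.25 or newer, and it assumes non-negative priorities.

```python
import pickle
import random
import sqlite3


def create_table(conn):
    # Assumed schema: the priority lives outside the serialized game object.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS games ("
        "id INTEGER PRIMARY KEY, game_priority REAL NOT NULL, value BLOB NOT NULL)"
    )


def add_game(conn, game_id, game_history, game_priority):
    conn.execute(
        "INSERT OR REPLACE INTO games (id, game_priority, value) VALUES (?, ?, ?)",
        (game_id, game_priority, pickle.dumps(game_history)),
    )


def update_priority(conn, game_id, game_priority):
    # Only the numeric column changes; the BLOB is never deserialized.
    conn.execute(
        "UPDATE games SET game_priority = ? WHERE id = ?", (game_priority, game_id)
    )


def sample_game_id(conn, rng=random):
    """Return one game id with probability proportional to game_priority."""
    total = conn.execute("SELECT SUM(game_priority) FROM games").fetchone()[0]
    if not total:
        raise ValueError("no games with positive priority")
    threshold = rng.random() * total  # uniform in [0, total)
    row = conn.execute(
        "SELECT id FROM ("
        "  SELECT id, SUM(game_priority) OVER (ORDER BY id) AS cum FROM games"
        ") WHERE cum > ? ORDER BY id LIMIT 1",
        (threshold,),
    ).fetchone()
    if row is None:
        # Guard against floating-point rounding: fall back to the last game.
        row = conn.execute("SELECT MAX(id) FROM games").fetchone()
    return row[0]


def load_game(conn, game_id):
    row = conn.execute("SELECT value FROM games WHERE id = ?", (game_id,)).fetchone()
    return pickle.loads(row[0])
```

The running sum is recomputed on every draw, so each sample costs one pass over the priority column. That is cheap compared with unpickling every game, but a very large buffer might need a different index structure.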

me-unsolicited, Apr 27 '21 01:04