QDax
PGAME Replay Buffer deletes the newest transitions
Hi :)
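The current PGAME Replay Buffer uses jax.lax.dynamic_update_slice to add new transitions to the replay buffer. However, this does not behave like a circular buffer: jax.lax.dynamic_update_slice clamps the start index so that the whole update fits inside the array. As a result, if a batch contains more transitions than there are slots remaining in the buffer, the write is shifted backwards and overwrites the most recent transitions instead of wrapping around and overwriting the oldest ones.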
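Here is a minimal sketch that reproduces the behaviour on a toy 1-D buffer and shows one possible circular insertion. The helper names (insert_clamped, insert_circular) and the toy data are made up for illustration and are not the actual QDax code. The sketch also assumes that a batch is never larger than the buffer.

```python
import jax
import jax.numpy as jnp


def insert_clamped(data, position, batch):
    # Hypothetical helper mimicking the reported behaviour.
    # dynamic_update_slice clamps the start index so the update stays in
    # bounds, which shifts the write backwards when the batch does not fit.
    return jax.lax.dynamic_update_slice(data, batch, (position,))


def insert_circular(data, position, batch):
    # Hypothetical alternative: wrap the write indices modulo the buffer
    # size so the oldest entries are overwritten first.
    # Assumes batch.shape[0] <= data.shape[0].
    size = data.shape[0]
    indices = (position + jnp.arange(batch.shape[0])) % size
    new_data = data.at[indices].set(batch)
    new_position = (position + batch.shape[0]) % size
    return new_data, new_position


# Toy buffer of size 5: values 1..4 were inserted in order (1 is the oldest),
# slot 4 is still empty, so the current write position is 4.
buffer = jnp.array([1, 2, 3, 4, 0])
batch = jnp.array([5, 6, 7])

clamped = insert_clamped(buffer, 4, batch)
circular, next_position = insert_circular(buffer, 4, batch)

print(clamped)        # [1 2 5 6 7] -> the newest entries 3 and 4 are lost
print(circular)       # [6 7 3 4 5] -> the oldest entries 1 and 2 are lost
print(next_position)  # 2
```

The modulo-indexed scatter keeps the function jit-compatible because the batch size is static. For a real buffer storing flattened transitions as rows of a 2-D array, the same indexing would apply along the first axis.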