Mava
[BUG]: Remove Reverb sampler from trainer tf.function.
Describe the bug
The dataset sampler should still be removed from the MADDPG/MAD4PG tf.function training steps and placed outside them. If the sampler is inside tf.function, it may sample random noise data, because tf.function interferes with its stop calculations.
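A rough sketch of the proposed structure, not Mava's actual trainer code: sampling happens eagerly in a plain Python `step()`, and only the update is wrapped in `tf.function`. The `tf.data` pipeline here is just a stand-in for a Reverb-backed dataset, and the names `train_step`, `step`, and the toy loss are made up for illustration.

```python
import tensorflow as tf

# Stand-in for a Reverb-backed dataset (assumption: the real trainer would
# build its dataset from a Reverb replay table, not from random tensors).
dataset = tf.data.Dataset.from_tensor_slices(
    tf.random.normal([64, 4])
).repeat().batch(8)
iterator = iter(dataset)

# Toy parameters; a real system would hold policy and critic networks.
w = tf.Variable(tf.zeros([4, 1]))


@tf.function
def train_step(batch):
    # Only the loss and update are traced; no sampling happens in here.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(tf.matmul(batch, w) - 1.0))
    grad = tape.gradient(loss, w)
    w.assign_sub(0.1 * grad)  # plain gradient-descent update
    return loss


def step():
    # Sampling runs eagerly, outside tf.function.
    batch = next(iterator)
    return train_step(batch)


losses = [float(step()) for _ in range(3)]
```

With this layout the sampler call is ordinary Python, so whatever blocking or stopping logic it has is not affected by graph tracing.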
In my experience, having the Reverb sample inside tf.function is only a problem when you use a queue. So I expect it to be fine when using a regular replay buffer like in MADQN and MADDPG. Am I wrong? @DriesSmit
Closing all TF issues as we are deprecating our TF systems.