
What tools do you use to store data for distributed training?

Open GoingMyWay opened this issue 2 years ago • 1 comments

Hi, I see that you use Reverb to store data. Is Reverb fast enough for distributed training? For example, how long does it take to get a batch from a remote replay buffer using Reverb?

GoingMyWay avatar Jul 07 '22 02:07 GoingMyWay

Hi @GoingMyWay, we have been using Reverb as the replay buffer since the inception of MAVA, so I cannot comment on alternatives. We also do not run a remote server, i.e. the Reverb server runs on localhost in MAVA.

In general, the performance can be improved. We did investigate and benchmark how fast Reverb was for our use case (https://github.com/deepmind/reverb/issues/94), but that was mostly about adding data to the server, not sampling. Sampling has never been flagged as a bottleneck, nor has it slowed down our executors in the past. The tradeoff is that it is a robust repo that serves its purpose nicely.
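For anyone who wants to measure batch-sampling latency on their own setup, a minimal timing sketch could look like the following. It is not taken from MAVA or Reverb: `time_sampling`, `summarize`, and `sample_batch` are hypothetical names, and the in-memory list is only a stand-in for a replay buffer. To benchmark a real (local or remote) replay server, `sample_batch` would be replaced by whatever call fetches one batch from it.

```python
import random
import statistics
import time


def time_sampling(sample_fn, num_trials=100, warmup=10):
    """Call sample_fn repeatedly and return the per-call latencies in seconds."""
    # Warm-up calls are not timed, so one-off setup costs do not skew the result.
    for _ in range(warmup):
        sample_fn()
    latencies = []
    for _ in range(num_trials):
        start = time.perf_counter()
        sample_fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return mean, median, approximate 95th percentile, and max latency."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "mean_s": statistics.mean(ordered),
        "median_s": statistics.median(ordered),
        "p95_s": ordered[p95_index],
        "max_s": ordered[-1],
    }


# Stand-in replay buffer (assumption): a plain list of (step, reward) tuples.
buffer = [(step, random.random()) for step in range(10_000)]


def sample_batch(batch_size=256):
    """Hypothetical sampler: draw a uniform batch from the stand-in buffer."""
    return random.sample(buffer, batch_size)


if __name__ == "__main__":
    stats = summarize(time_sampling(sample_batch))
    for name, value in stats.items():
        print(f"{name}: {value * 1e3:.3f} ms")
```

Reporting the median and a high percentile alongside the mean helps show whether occasional slow samples (for example from network hiccups with a remote server) dominate the average.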

Hope this helps

AsadJeewa avatar Jul 07 '22 10:07 AsadJeewa