
TensorFlow data loader example

Open · ewenw opened this issue 3 years ago · 2 comments

The TensorFlow client example is very helpful, but one suggestion for improvement is to add a data loader based on tf.data.Dataset. I think the existing data loaders all use torch, and this client converts torch tensors into tf tensors, which is less efficient.
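For illustration, a loader built directly on tf.data might look like the sketch below. It is not FedScale code: the `make_dataset` helper, the array shapes, and the random data are all assumptions made for the example, and a real client would substitute its own partitioned data.

```python
import numpy as np
import tensorflow as tf


def make_dataset(features, labels, batch_size=32, shuffle=True, seed=0):
    """Build a batched tf.data pipeline from in-memory NumPy arrays.

    Hypothetical helper for illustration only; not part of FedScale.
    """
    ds = tf.data.Dataset.from_tensor_slices((features, labels))
    if shuffle:
        # A buffer as large as the dataset gives a full shuffle.
        ds = ds.shuffle(buffer_size=len(features), seed=seed)
    # prefetch lets the pipeline prepare the next batch while the
    # current one is being consumed by the training step.
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)


# Placeholder data standing in for one client's local partition.
features = np.random.rand(100, 28, 28, 1).astype("float32")
labels = np.random.randint(0, 10, size=100).astype("int64")
train_ds = make_dataset(features, labels, batch_size=32)
```

Batches from `train_ds` are already tf tensors, so no torch-to-tf conversion is needed inside the training loop.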

ewenw · Jul 13 '22 13:07

Thanks for raising this. Yes, the current data loader is less efficient in my tests, and I agree we should support native tf data loaders.

However, it may take a while to do so for all datasets.

  • Do you have a preference for which datasets/tasks to prioritize? One workaround for now is to prefetch the next batch to hide this latency.
  • Is TensorFlow (or JAX) dominant in your use case? FedScale can support tf, but optimizing it is not a top priority for us right now (we are actively working on scalability and on modules such as the communication layers). That said, contributions are very welcome, and we are happy to help.
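The prefetching workaround mentioned above could be sketched roughly as follows. This is a generic, framework-agnostic wrapper using a background thread, not an existing FedScale utility; the `prefetch` function and its `buffer_size` parameter are names invented for this example.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the source iterable


def prefetch(batches, buffer_size=1):
    """Yield items from `batches`, loading up to `buffer_size` ahead.

    Hypothetical helper: a background thread pulls from the source
    iterable while the consumer is busy with the current batch.
    """
    q = queue.Queue(maxsize=buffer_size)
    errors = []

    def worker():
        try:
            for batch in batches:
                q.put(batch)  # blocks while the buffer is full
        except Exception as exc:
            errors.append(exc)  # re-raised in the consumer below
        finally:
            q.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = q.get()
        if item is _DONE:
            break
        yield item
    if errors:
        raise errors[0]


# Usage sketch: wrap any iterable of batches, for example an existing
# torch DataLoader, so loading overlaps with the training step.
for batch in prefetch(range(3), buffer_size=2):
    pass  # the training step would go here
```

The wrapper only hides loading latency when the source releases the GIL (I/O or native code), and it assumes the consumer drains the iterator; if iteration stops early, the daemon thread is simply left blocked until the process exits.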

fanlai0990 · Jul 14 '22 07:07

We have no preference regarding datasets, since we will be building a custom data loader for our internal dataset. I just think it would be useful to have a reference example in examples/tensorflow_engine.

And yes, TensorFlow will be dominant in our use case, and we are happy to contribute as we make progress!

ewenw · Jul 14 '22 20:07