distributed_tutorial
[Bug] Multiple datasets created in each train process
Hi, I'm not sure wrapping the dataset creation inside the training function is a good idea. One issue is that with multiple GPUs, each spawned process tries to download the MNIST dataset, so it gets downloaded more than once. Perhaps it's better to create the Dataset object in the main function, pass it into the train function, and create the distributed sampler inside? A rough sketch of what I mean is below. Thanks for your help.
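Here is a minimal sketch of that idea, assuming the usual `torch.multiprocessing.spawn` setup from the tutorial; the `train`/`main` names, batch size, and epoch count are just placeholders:

```python
import torch
import torch.multiprocessing as mp
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler
from torchvision import datasets, transforms


def train(rank, world_size, dataset):
    # Each process builds its own sampler and loader, but reuses the
    # dataset object that main() already created and downloaded.
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)
    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle partitions each epoch
        for images, labels in loader:
            pass  # forward/backward/optimizer step would go here


def main():
    world_size = torch.cuda.device_count()
    # The dataset is created (and MNIST downloaded) exactly once, here,
    # instead of once per spawned process.
    dataset = datasets.MNIST(
        root="./data",
        train=True,
        download=True,
        transform=transforms.ToTensor(),
    )
    mp.spawn(train, args=(world_size, dataset), nprocs=world_size)


if __name__ == "__main__":
    main()
```

One caveat with this approach: the dataset object passed through `mp.spawn` is pickled into each child process, which is fine for in-memory datasets like MNIST but may be costly for larger ones.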