Add Imagenet Benchmark
To test how well goofys performs on a real deep learning workload, I made this benchmark.
ImageNet Dataset Information
Stats
- 1M training images and 50K validation images
- Total size: about 150 GB
- Publicly accessible S3 bucket in us-east-2 (URL)
Dataset hierarchy
- datasets/
  - ILSVRC2012/
    - imagenet/
      - meta.bin
      - train/
        - class0/ ... class999/: 1000 class directories, each containing ~1000 images (~150KB per image)
      - val/
        - class0/ ... class999/: 1000 class directories, each containing 50 images (~150KB per image)
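For reference, a minimal sketch of how this layout can be consumed once the bucket is mounted with goofys. The bucket name and mount point below are placeholders, not the actual benchmark setup.

```python
# Minimal sketch (not the exact benchmark script). Mount the bucket with goofys,
# then read the ImageFolder-style layout with torchvision:
#
#   goofys --region us-east-2 <bucket-name> /data    # placeholder bucket / mount point
#
import torchvision.datasets as datasets
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
])

# ImageFolder treats each subdirectory (class0/ ... class999/) as one class.
train_set = datasets.ImageFolder(
    "/data/datasets/ILSVRC2012/imagenet/train", transform=transform
)
print(len(train_set.classes))  # expected: 1000 classes
print(len(train_set))          # expected: ~1M images (per the stats above)
```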
ResNet-18
By increasing the number of data-loading workers, S3-based (per-image) training can reach a speed comparable to EBS (a minimal sketch of the loader setup follows the list):
- With the same number of workers, S3 is about 1.4x slower than EBS.
- 64 workers on S3 reach a speed similar to 4 workers on EBS.

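For context, a rough sketch of the knob being varied. This is not the actual benchmark script, and the paths are the placeholder mount points from the sketch above.

```python
# Minimal sketch of the workers knob (not the actual benchmark code): the only
# parameter varied between the EBS and S3/goofys runs is num_workers.
import time
from torch.utils.data import DataLoader
import torchvision.datasets as datasets
import torchvision.transforms as transforms

def measure_throughput(root, num_workers, batch_size=256, num_batches=50):
    dataset = datasets.ImageFolder(
        root,
        transforms.Compose([transforms.RandomResizedCrop(224), transforms.ToTensor()]),
    )
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True,
                        num_workers=num_workers, pin_memory=True)
    it = iter(loader)
    start = time.time()
    for _ in range(num_batches):
        images, labels = next(it)
    elapsed = time.time() - start
    return num_batches * batch_size / elapsed  # images / second

# e.g. compare 4 workers on EBS vs. 64 workers on the goofys mount (placeholder paths)
print(measure_throughput("/ebs/imagenet/train", num_workers=4))
print(measure_throughput("/data/datasets/ILSVRC2012/imagenet/train", num_workers=64))
```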
More details can be found in the result sheet / WandB.
This is awesome, thanks for running this! Two thoughts:
- Could you also post some stats on the dataset? Number of files, size of each file, directory hierarchy (if possible)?
- Can we run a similar benchmark on a dataset which uses large binary files to store the data? Curious to see the numbers there.
> Could you also post some stats on the dataset? Number of files, size of each file, directory hierarchy (if possible)?
Great suggestions! Added the dataset information to the description, with a link to the publicly accessible S3 bucket.
> Can we run a similar benchmark on a dataset which uses large binary files to store the dataset? Curious to see the numbers there.
I think @michaelzhiluo has an ImageNet bucket with TFRecords in it. I will figure out later whether we can run the PyTorch code against it.
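In the meantime, a hedged sketch of how TFRecord-sharded ImageNet is typically read. It uses tf.data rather than PyTorch, and the feature keys and shard pattern are assumptions based on the common TF ImageNet convention, not necessarily what that bucket contains.

```python
# Sketch for reading ImageNet stored as large TFRecord shards (assumed layout).
# Feature keys follow the common TF ImageNet convention; the actual bucket may differ.
import tensorflow as tf

def parse_example(serialized):
    features = tf.io.parse_single_example(
        serialized,
        {
            "image/encoded": tf.io.FixedLenFeature([], tf.string),
            "image/class/label": tf.io.FixedLenFeature([], tf.int64),
        },
    )
    image = tf.io.decode_jpeg(features["image/encoded"], channels=3)
    image = tf.image.resize(image, [224, 224])
    return image, features["image/class/label"]

# Shards read through the same goofys mount (placeholder path and shard pattern).
files = tf.data.Dataset.list_files("/data/imagenet-tfrecords/train-*")
dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(256)
    .prefetch(tf.data.AUTOTUNE)
)
```

Feeding this into the existing PyTorch training loop would still need an adapter (or a different reader), so the numbers here would mainly tell us how the large-file access pattern behaves on goofys.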