MixMIM
Hardware & training time for pretraining
What hardware did you use for pretraining, and how long did it take? Specifically, I'm interested in the following statistics:
- Arch & Epoch: [e.g. ViT-B, 300 ep]
- Hardware: [e.g. single 8 V100 node]
- Batch time: [e.g. 0.8s]
- Epoch time: [e.g. 12 mins 30s]
- Training time: [e.g. 28 h]
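
For context, the three timings requested above are related by simple arithmetic, so any one of them helps in estimating the others. Below is a minimal sketch of that relation. The function name `estimate_training_time` and all input values are hypothetical placeholders, not MixMIM's actual settings, and the estimate ignores data-loading stalls, evaluation, and checkpointing overhead.

```python
import math


def estimate_training_time(num_samples, global_batch_size, batch_time_s, epochs):
    """Return (iterations per epoch, epoch time in seconds, total time in hours).

    Assumes every iteration takes `batch_time_s` seconds and that the last,
    possibly partial, batch of each epoch is still processed.
    """
    iters_per_epoch = math.ceil(num_samples / global_batch_size)
    epoch_time_s = iters_per_epoch * batch_time_s
    total_hours = epoch_time_s * epochs / 3600
    return iters_per_epoch, epoch_time_s, total_hours


# Hypothetical inputs for illustration only: an ImageNet-1K-sized training set,
# an assumed global batch size of 1024, and the placeholder 0.8 s batch time.
iters, epoch_s, hours = estimate_training_time(
    num_samples=1_281_167,
    global_batch_size=1024,
    batch_time_s=0.8,
    epochs=300,
)
print(f"{iters} iters/epoch, {epoch_s / 60:.1f} min/epoch, {hours:.1f} h total")
```

Knowing the real batch time and global batch size would let readers scale this estimate to their own hardware.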