MVP_Benchmark
Training on multiple GPUs
Dear Authors,
I noticed that your paper says the completion model is trained "using the Adam optimizer with initial learning rate 1e−4 (decayed by 0.7 every 40 epochs) and batch size 32 by NVIDIA TITAN Xp GPU". Did you train on a single GPU, or on multiple GPUs?
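For concreteness, here is how I read the quoted schedule. This is only a sketch of my interpretation: it assumes a staircase (step) decay, and the function and constant names are my own, not taken from the repository.

```python
# Hypothetical reading of the quoted settings; not the authors' code.
BASE_LR = 1e-4      # initial learning rate quoted from the paper
DECAY = 0.7         # multiplicative decay factor quoted from the paper
DECAY_EVERY = 40    # epochs between decays quoted from the paper
BATCH_SIZE = 32     # batch size quoted from the paper


def lr_at_epoch(epoch: int) -> float:
    """Learning rate at a 0-indexed epoch, assuming a staircase decay."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    # Integer division counts how many full 40-epoch periods have passed.
    return BASE_LR * DECAY ** (epoch // DECAY_EVERY)


if __name__ == "__main__":
    for epoch in (0, 39, 40, 80):
        print(epoch, lr_at_epoch(epoch))
```

If the decay is instead applied continuously or per iteration, the values between multiples of 40 epochs would differ from this sketch.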
Thank you for your work. I look forward to your reply.