MVP_Benchmark Training on multiple gpus


Open • chenzhik opened this issue 2 years ago • 1 comment

Dear Authors,

I noticed that, according to your paper, the completion model is trained "using the Adam optimizer with initial learning rate 1e−4 (decayed by 0.7 every 40 epochs) and batch size 32 by NVIDIA TITAN Xp GPU". Does this mean you trained on a single GPU, or on multiple GPUs?
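
To make the quoted settings concrete, here is a minimal sketch of how I read them. It is not taken from the repository: the function names are hypothetical, and it assumes that epochs are counted from 0, that the decay is applied at each 40-epoch boundary, and that a batch of 32 would be split evenly across GPUs if more than one were used.

```python
def step_decay_lr(epoch, base_lr=1e-4, gamma=0.7, step=40):
    """Learning rate at a 0-indexed epoch under step decay.

    Assumption: the rate is multiplied by `gamma` once every `step` epochs.
    """
    return base_lr * gamma ** (epoch // step)


def per_gpu_batch(total_batch=32, n_gpus=1):
    """Samples per GPU if the total batch is split evenly (an assumption)."""
    if n_gpus < 1 or total_batch % n_gpus != 0:
        raise ValueError("total_batch must be divisible by a positive n_gpus")
    return total_batch // n_gpus


if __name__ == "__main__":
    # Print the learning rate at a few epochs around the decay boundaries.
    for epoch in (0, 39, 40, 80):
        print(epoch, step_decay_lr(epoch))
    # Per-GPU batch size for one GPU versus four GPUs.
    print(per_gpu_batch(32, 1), per_gpu_batch(32, 4))
```

If the batch size of 32 is per GPU rather than in total, the effective batch size on several GPUs would be larger, which is why the number of GPUs matters for reproducing the results.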

Thanks for your work. I look forward to your reply.

Jan 08 '23 09:01 chenzhik
