HAT
Should I change the learning rate if I use 4 GPUs while keeping the same total batch size, i.e., 8 per GPU? Thanks.
By the way, could you please tell me whether you have tried training the network with 4 A100 GPUs? If so, can each GPU accommodate a batch size of 8? Thanks.
@morgen-star You don't need to change the learning rate when keeping the total batch size the same. For the basic version of HAT, 4 A100 GPUs with a per-GPU batch size of 8 is fine.
If I use 2 GPUs with a batch size of 8 per GPU (total batch size = 16), does the learning rate need to be adjusted? Thank you for your answer!
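For reference, a common heuristic when the total batch size does change is the linear scaling rule (scale the learning rate proportionally to the total batch size). The sketch below illustrates it; the base values of 2e-4 and a total batch size of 32 are only assumptions for the example, not HAT's official defaults, so check your own training config.

```python
def scaled_lr(base_lr: float, base_total_batch: int,
              num_gpus: int, batch_per_gpu: int) -> float:
    """Scale the learning rate linearly with the new total batch size."""
    total_batch = num_gpus * batch_per_gpu
    return base_lr * total_batch / base_total_batch

# 4 GPUs x 8 per GPU = 32 total -> learning rate unchanged
print(scaled_lr(2e-4, 32, num_gpus=4, batch_per_gpu=8))  # 2e-04

# 2 GPUs x 8 per GPU = 16 total -> learning rate halved under linear scaling
print(scaled_lr(2e-4, 32, num_gpus=2, batch_per_gpu=8))  # 1e-04
```

Whether to apply this rule (or simply keep the original learning rate and train longer) is a judgment call; small reductions in total batch size often work fine without any adjustment.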