crnn.pytorch
Loss descent speed differs greatly between multi-GPU and single-GPU training with the same learning rate
When I use 4 GPUs to train the model with batch size 256 and learning rate 0.001, the loss descends very slowly, but when using just a single GPU with batch size 64 and learning rate 0.001, it seems to converge very fast.
Were you able to train the model on multiple GPUs? Can you tell me how? @Alex220284
The loss is normalized by the number of samples, so you should multiply the learning rate by 4 to get the same convergence speed when you set batch size 256. See this paper for details: https://arxiv.org/abs/1706.02677
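A minimal sketch of that linear scaling rule (Goyal et al., 2017), using the numbers from this thread; the `scaled_lr` helper is just an illustration, not part of the repo:

```python
# Linear scaling rule: when the effective batch size grows by a
# factor k, multiply the base learning rate by k as well.
# Hypothetical numbers below match this thread:
# 1 GPU x batch 64 vs. 4 GPUs x batch 256 (total).

def scaled_lr(base_lr, base_batch, new_batch):
    """Scale the learning rate proportionally to the batch size."""
    return base_lr * (new_batch / base_batch)

single_gpu_lr = 0.001                      # batch size 64, 1 GPU
multi_gpu_lr = scaled_lr(0.001, 64, 256)   # batch size 256, 4 GPUs
print(multi_gpu_lr)                        # 0.004
```

The paper also suggests warming up the learning rate over the first few epochs when the scaled value is large, rather than applying it from step one.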
When I use multi-GPU, the test function val raises: assert t.numel() == length.sum(), "texts with length: {} does not match declared length: {}".format(t.numel(), length.sum()) AssertionError: texts with length: 19328 does not match declared length: 38656
In preds = crnn(image), the image batch size is 128, but preds has batch size 64. Can you tell me why? Thank you very much!
Did you solve this problem? Thanks