PVT
PVT copied to clipboard
learning rate for AdamW
What could be the best learning rate if I have 1 GPU, and I set the samples_per_gpu=1 and workers_per_gpu=1?