GradCache
Gradient update is extremely slow
I am trying to train an image-text contrastive learning model using the Functional Approach. The number of grad steps is 32 and the batch size per step is 32, which makes the total batch size 1024.
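For reference, here is a rough sketch of the arithmetic behind this setup. It assumes the usual gradient-caching scheme: a no-grad forward pass over every chunk to collect representations, then a second forward pass with the graph attached and a backward pass per chunk. `grad_cache_update_cost` and `UpdateCost` are hypothetical names for illustration, not part of GradCache.

```python
from dataclasses import dataclass


@dataclass
class UpdateCost:
    effective_batch_size: int
    forward_passes: int   # per encoder, per optimizer update
    backward_passes: int  # per encoder, per optimizer update


def grad_cache_update_cost(num_chunks: int, chunk_size: int) -> UpdateCost:
    """Hypothetical helper (not a GradCache API): count encoder passes per update."""
    # Pass 1: forward every chunk without gradients to collect representations.
    no_grad_forwards = num_chunks
    # Pass 2: forward every chunk again with the graph attached, then
    # backpropagate the cached representation gradients through it.
    grad_forwards = num_chunks
    return UpdateCost(
        effective_batch_size=num_chunks * chunk_size,
        forward_passes=no_grad_forwards + grad_forwards,
        backward_passes=num_chunks,
    )


# The configuration described above: 32 grad steps of 32 examples each.
cost = grad_cache_update_cost(num_chunks=32, chunk_size=32)
print(cost)
```

Under these assumptions, one update at total batch size 1024 costs 64 forward passes and 32 backward passes per encoder, so some slowdown relative to a single large-batch step is expected.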
Any idea how to speed up the gradient update?