DAIR-RCooper
Model Training
I noticed the training setup described in your paper: "They are trained for 50 epochs with a batch size of 16." Did you use multiple GPUs for training? I only have a single GPU with 24 GB of memory, so I cannot fit a batch size that large on one card. Could you share any training tips not mentioned in the paper? The results I have reproduced so far still fall somewhat short of your published numbers. Looking forward to your reply!
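One common workaround I have been considering (this is an assumption on my part, not something from the paper) is gradient accumulation: summing gradients over several small micro-batches before each optimizer step, so a single 24 GB card can emulate the effective batch size of 16. The toy sketch below uses a hypothetical 1-D linear model with an MSE loss to show that averaging equal-sized micro-batch gradients reproduces the full-batch gradient:

```python
# Sketch (not from the paper): gradient accumulation emulates a large
# effective batch by averaging gradients over smaller micro-batches.
# Toy 1-D linear model, loss = mean((w*x - y)^2); all names illustrative.

def grad_mse(w, xs, ys):
    """Gradient of mean squared error w.r.t. w for one batch."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, xs, ys, micro_batch):
    """Average micro-batch gradients; equals the full-batch gradient
    when the effective batch splits evenly into micro-batches."""
    chunks = [(xs[i:i + micro_batch], ys[i:i + micro_batch])
              for i in range(0, len(xs), micro_batch)]
    return sum(grad_mse(w, mx, my) for mx, my in chunks) / len(chunks)

xs = [float(i) for i in range(16)]      # effective batch of 16
ys = [3.0 * x + 1.0 for x in xs]
w = 0.5

full = grad_mse(w, xs, ys)              # one big batch (what the paper uses)
accum = accumulated_grad(w, xs, ys, 4)  # four micro-batches of 4
print(abs(full - accum) < 1e-9)         # the two gradients match
```

Would this kind of accumulation (e.g. 4 micro-batches of 4) be expected to match your multi-GPU results, or does the training depend on per-device batch statistics (such as BatchNorm) in a way that would change the outcome?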