
Loss weights for optimal training

Open parthnatekar opened this issue 1 year ago • 1 comment

Hi,

I had a question about the optimal weighting of the losses during training.

I notice that if you initially weight the fc loss, vq loss, and reconstruction loss equally, the quantizer is not trained well enough to produce meaningful outputs, and the network never learns a good codebook because training is biased too heavily toward prediction. Reducing the prediction loss weight, on the other hand, makes the network learn a representation that is good for reconstruction/quantization but not for prediction. I am unable to find a good balance.
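
For concreteness, this is roughly the weighted sum I mean; the coefficient names are my own placeholders, not variables from the cytoself codebase:

```python
import torch

# Placeholder coefficients -- these are exactly the values in question,
# not defaults taken from cytoself.
W_FC, W_VQ, W_REC = 1.0, 1.0, 1.0

def total_loss(fc_loss: torch.Tensor,
               vq_loss: torch.Tensor,
               rec_loss: torch.Tensor) -> torch.Tensor:
    # Weighted sum of the classification (fc), vector-quantization,
    # and reconstruction terms.
    return W_FC * fc_loss + W_VQ * vq_loss + W_REC * rec_loss
```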

Was the weighting of the losses modulated during training? What loss weights were optimal for training?
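
To clarify what I mean by "modulated": something like a warm-up schedule that ramps the prediction weight up only after the codebook has had a few epochs to settle. This is an entirely hypothetical sketch, not something I found in the repo:

```python
def fc_weight_schedule(epoch: int,
                       warmup_epochs: int = 10,
                       max_weight: float = 1.0) -> float:
    # Linearly ramp the classification-loss weight from 0 to
    # max_weight over the first `warmup_epochs` epochs, then hold.
    return max_weight * min(1.0, epoch / warmup_epochs)
```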

parthnatekar • Mar 01 '23 03:03