info-nce-pytorch
Will minimizing this loss function minimize or maximize mutual information?
I'm confused about this. The NCE loss gives a lower bound on the mutual information. In this implementation, should the NCE loss be minimized in order to increase the mutual information?
A lower NCE loss means a higher lower bound of the mutual information. So minimizing the NCE loss will maximize the (lower bound of the) mutual information.
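To make the relationship concrete, here is a minimal sketch of the commonly cited InfoNCE bound, I(X; Y) >= log(N) - L_NCE, where N is the number of candidates per query (one positive plus N - 1 negatives). It uses NumPy rather than PyTorch to stay dependency-light and is not this repository's implementation; the function name `info_nce_loss`, the temperature of 0.1, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def info_nce_loss(query, keys, temperature=0.1):
    """Mean cross-entropy where keys[i] is the positive for query[i]
    and every other row of keys acts as a negative (hypothetical helper)."""
    # L2-normalize so the logits are scaled cosine similarities.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature  # shape (N, N), positives on the diagonal
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
n, d = 128, 16
x = rng.normal(size=(n, d))
aligned = x + 0.05 * rng.normal(size=(n, d))  # strongly dependent on x
unrelated = rng.normal(size=(n, d))           # independent of x

loss_aligned = info_nce_loss(x, aligned)
loss_unrelated = info_nce_loss(x, unrelated)

# Lower loss -> higher value of the bound log(N) - loss.
bound_aligned = np.log(n) - loss_aligned
bound_unrelated = np.log(n) - loss_unrelated
print(f"aligned:   loss={loss_aligned:.3f}  bound={bound_aligned:.3f}")
print(f"unrelated: loss={loss_unrelated:.3f}  bound={bound_unrelated:.3f}")
```

Since the loss is non-negative, the bound can never exceed log(N), so a larger batch is needed before the bound can certify a large mutual information.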
In my actual PyTorch training, I use the NCE loss as part of my loss function and monitor the mutual information computed with sklearn's API (sklearn.metrics.mutual_info_score) during training. However, I found that as the NCE loss decreases, the measured mutual information also decreases. I can't think of a reason to explain this.
I think I figured it out. sklearn.metrics.mutual_info_score cannot be used to calculate the mutual information between continuous variables, so the values I observed were wrong.
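A small sketch of why this happens, using synthetic data that is an assumption and not the data from the training run above: `mutual_info_score` treats every distinct value as its own cluster label, so on continuous samples with no repeated values it returns log(n) whether or not the variables are related. For comparison, `sklearn.feature_selection.mutual_info_regression` uses a nearest-neighbor estimator intended for continuous variables; it is shown here as one possible alternative, not as the fix used in this thread.

```python
import warnings

import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y_dep = x + 0.1 * rng.normal(size=n)  # strongly dependent on x
y_indep = rng.normal(size=n)          # independent of x

# mutual_info_score expects discrete labels. Every float here is unique,
# so the contingency table has one sample per cell and the score is log(n)
# in both cases. Some scikit-learn versions may warn about continuous input.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    discrete_dep = mutual_info_score(x, y_dep)
    discrete_indep = mutual_info_score(x, y_indep)

# k-nearest-neighbor estimator for continuous variables (values in nats).
knn_dep = mutual_info_regression(x.reshape(-1, 1), y_dep, random_state=0)[0]
knn_indep = mutual_info_regression(x.reshape(-1, 1), y_indep, random_state=0)[0]

print(f"mutual_info_score:      dep={discrete_dep:.3f}  indep={discrete_indep:.3f}")
print(f"mutual_info_regression: dep={knn_dep:.3f}  indep={knn_indep:.3f}")
```

The first pair of numbers is identical and carries no information about dependence, while the second pair separates the dependent and independent cases. For high-dimensional embeddings, neighbor-based estimates are themselves noisy, so they are best read as a rough trend rather than an exact value.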