InfoPro-Pytorch
Question about mutual information estimation
Thanks for your awesome work! I'm very interested in the mutual information estimation used in your paper.
According to Appendix G and your finding in Figure 6, you train an auxiliary classifier to estimate I(h, y), and end-to-end supervised training retains all task-relevant information.
From my perspective, this suggests that we don't need to build a classifier on the final feature map: deploying a classifier (trained on feature maps from many layers) on the first feature map would be enough.
I'm not sure I understand this correctly. Would you help me clarify this?
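To make the question concrete, here is a minimal sketch of how I picture a classifier-based estimate of I(h, y). It assumes the standard variational lower bound I(h; y) >= H(y) - CE(q), where q(y|h) is the auxiliary classifier's predicted distribution and CE is its cross-entropy on the true labels. This is my assumption and may differ from the exact procedure in Appendix G. The function names and toy data are hypothetical.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Empirical entropy H(y) of the labels, in nats."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels under q(y|h), in nats."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def mi_lower_bound(probs, labels):
    """Assumed estimator: variational bound I(h; y) >= H(y) - CE(q)."""
    return label_entropy(labels) - cross_entropy(probs, labels)

# Hypothetical toy data: two balanced classes, four samples.
labels = [0, 1, 0, 1]
# Outputs of an auxiliary classifier that reads the feature h well.
confident = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]
# Outputs of a classifier that extracts nothing from h.
uniform = [[0.5, 0.5]] * 4

informative_estimate = mi_lower_bound(confident, labels)
uninformative_estimate = mi_lower_bound(uniform, labels)
```

Under this sketch, the estimate for a given layer is capped by H(y) and drops to zero when the classifier does no better than guessing, so it measures how much label information a classifier can read out of h, not how simple that classifier can be.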