VIBNet
question about layer-by-layer IB loss
Hi, consider a simple network fed with a random input x:
x --> A --> B --> classifier --> y^hat
I want to squeeze x through layers A and B, then get the logit vector from the classifier, and finally output the predicted label y^hat. What is the IB loss for each layer? Is it L = I(B;A) - I(A;y) + I(classifier;B) - I(classifier;y)? Why or why not? Thanks!
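To make the notation explicit, here is the candidate loss from the question written out, where C is just my shorthand for the classifier's logit output (C is an assumed symbol, not a name from the repo). This only restates what I am asking about; it is not a claim that this is the right objective:

```latex
% Candidate layer-by-layer loss from the question (to be confirmed or corrected).
% A, B: layer outputs; C: classifier logits (assumed shorthand); y: true label.
\mathcal{L} \;=\; I(B; A) \;-\; I(A; y) \;+\; I(C; B) \;-\; I(C; y)
```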