Matrix-Capsules-EM-Tensorflow
Is r supposed to be the same between capsules in layer L?
Thank you very much for your work. After printing r, I find that r is the same across capsules in layer L. Is that expected?
r represents the hidden variables in the EM algorithm. I think the number of r values should be number_of_input_caps*number_of_output_caps. When r is initialized, all the values are the same, but I would expect them to change after each E-step.
I also encountered this problem with the EM algorithm as it is specified in the paper: all output capsules center on the same cluster (I understand this is what @tttoaster means by r being "the same between capsules").
In EM for Gaussian Mixture Models, one usually starts with random cluster centers and performs the E-step first, calculating the membership variables (r in our case). In the paper, they initialize the membership variables with a constant value and then perform the M-step first, which will naturally calculate the same mean for all capsules, since the membership is constant.
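To make the collapse concrete, here is a minimal NumPy sketch of a single M-step with constant memberships. It is not the repository's code: the helper name `m_step`, the toy data, and the shapes are my own assumptions, and it uses a plain GMM setting where every cluster sees the same data points, so it does not model the per-capsule vote transforms.

```python
import numpy as np

def m_step(x, r):
    """Membership-weighted mean of the data for each cluster.

    x: (n_points, dim) data points (hypothetical stand-in for votes)
    r: (n_points, n_clusters) membership variables
    """
    weights = r / r.sum(axis=0, keepdims=True)  # normalise per cluster
    return weights.T @ x                        # (n_clusters, dim)

rng = np.random.default_rng(0)
n_points, dim, n_clusters = 50, 2, 4
x = rng.normal(size=(n_points, dim))

# Constant initialisation of r, followed directly by the M-step.
r_const = np.full((n_points, n_clusters), 1.0 / n_clusters)
mu = m_step(x, r_const)

# Every row of mu is the plain average of x, so all clusters coincide.
print(mu)
```

Since every cluster then has the same mean, the following E-step assigns every point the same membership to every cluster again, and the iteration never breaks the symmetry.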
In order to fix this, one can either add some random noise on the initialization of r, or initialize the means randomly and perform the E-step first.
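Both options can be sketched in NumPy as follows. Again this is only an illustration under my own assumptions: `m_step` and `e_step` are hypothetical helpers, the E-step uses isotropic Gaussians with a fixed variance and equal mixing weights, and the noise scale of 0.05 is an arbitrary choice.

```python
import numpy as np

def m_step(x, r):
    """Membership-weighted mean per cluster; r has shape (n_points, n_clusters)."""
    weights = r / r.sum(axis=0, keepdims=True)
    return weights.T @ x

def e_step(x, mu, var=1.0):
    """Memberships from isotropic Gaussians with fixed variance (an assumption)."""
    d2 = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)  # (n_points, n_clusters)
    logits = -0.5 * d2 / var
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    r = np.exp(logits)
    return r / r.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_points, dim, n_clusters = 50, 2, 4
x = rng.normal(size=(n_points, dim))

# Option 1: constant r plus a little random noise, renormalised per point.
r_noisy = np.full((n_points, n_clusters), 1.0 / n_clusters)
r_noisy += rng.uniform(0.0, 0.05, size=r_noisy.shape)
r_noisy /= r_noisy.sum(axis=1, keepdims=True)
mu_noisy = m_step(x, r_noisy)

# Option 2: random initial means (here: random data points), E-step first.
mu_init = x[rng.choice(n_points, size=n_clusters, replace=False)]
r_first = e_step(x, mu_init)
mu_random = m_step(x, r_first)

print(mu_noisy)
print(mu_random)
```

In either case the cluster means are no longer identical after the first M-step, so later iterations can pull them apart.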
If you don't do this, you will always get uniform activations across all capsule types per layer - making the model equivalent to a single capsule type per layer.