capsule-networks
Is there an error in the softmax operation?
The paper says "The coupling coefficients between capsule i and all the capsules in the layer above sum to 1", so the softmax should be computed along the capsule dimension, but you computed it along the route-node dimension.
I think there is a problem here too. In my view, the "dim" parameter for softmax should be 0.
I second that. I think it should be dim=0. However, the network does not train successfully if I change it.
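To make the two normalisation choices discussed above concrete, here is a minimal sketch in plain Python (no PyTorch, so it is not the repository's code). The `softmax` helper, the toy `logits` table and its 4 x 3 size are assumptions made for illustration only. The same distinction is what the `dim` argument selects when softmax is applied to a tensor.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Toy routing logits b[i][j]: 4 route nodes (rows i) x 3 output capsules (columns j).
# Sizes and values are made up for illustration.
logits = [
    [0.0, 1.0, 2.0],
    [1.0, 1.0, 1.0],
    [2.0, 0.0, 0.5],
    [0.3, 0.2, 0.1],
]

# Option A: softmax over the output capsules j, separately for each route node i.
# Each route node's coefficients then sum to 1, as the quoted sentence requires.
over_capsules = [softmax(row) for row in logits]

# Option B: softmax over the route nodes i, separately for each output capsule j.
columns = [list(col) for col in zip(*logits)]
over_nodes = [list(row) for row in zip(*[softmax(col) for col in columns])]

# Sum of coefficients leaving each route node i under both options.
row_sums_capsules = [sum(row) for row in over_capsules]
row_sums_nodes = [sum(row) for row in over_nodes]

print("per-node sums, softmax over capsules:", row_sums_capsules)
print("per-node sums, softmax over route nodes:", row_sums_nodes)
```

Under option A every per-node sum is 1. Under option B it is the per-capsule sums that equal 1 instead, so the coefficients leaving a single route node generally do not sum to 1.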
I agree with @mrjel. But when I set dim=0, changed line 55 to
self.route_weights = nn.Parameter(0.01 * torch.randn(num_capsules, num_route_nodes, in_channels, out_channels))
and removed line 108 (maybe not necessary), I got 99.27% accuracy on the test set (epoch 5).
@zzzz94 Hi, thanks for your nice suggestions. I wonder why we need to set route_weights to relatively small values by multiplying by 0.01?
When I set dim=0 and remove line 108, the net performs like random guessing and the accuracy is ~10%. However, it works well with the smaller initial values for route_weights.
Looking forward to your response!
@zzzz94 First of all, thanks for your solution. @h982639009 Notice that before setting dim = 0, the dimension of c_ij is 1132, whereas with dim = 1 the dimension is 10. This small trick keeps the input weights at a similar magnitude. I think this problem can also be solved by using a larger learning rate at the start.
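A rough way to see the magnitude argument in the comment above is the sketch below. It assumes the routing logits all start at zero, so every coupling coefficient is uniform on the first routing iteration. The counts 1132 and 10 are simply the figures quoted in the comment, and the `softmax` helper is illustrative plain Python rather than the repository's code.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Figures quoted in the comment above (not verified against the code).
n_route_nodes = 1132
n_capsules = 10

# With all-zero logits the softmax is uniform, so each coefficient is 1 / N,
# where N is the number of entries the softmax normalises over.
c_over_nodes = softmax([0.0] * n_route_nodes)[0]
c_over_capsules = softmax([0.0] * n_capsules)[0]

# How much larger each coefficient becomes when normalising over capsules.
ratio = c_over_capsules / c_over_nodes

print("coefficient when normalising over route nodes:", c_over_nodes)
print("coefficient when normalising over capsules:", c_over_capsules)
print("ratio:", ratio)
```

Under these assumptions each coefficient grows by roughly two orders of magnitude when the softmax axis changes. That is at least consistent with scaling route_weights by 0.01 to keep the weighted sums at a similar size, though the thread does not confirm this is the full explanation.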