capsule-networks
Deviations of this implementation from the original paper
I have come across the following main deviations between the paper (https://arxiv.org/abs/1710.09829) and the implementation in this repo:
- In the paper: there are 32 primary capsule layers, each producing a 6x6 grid of capsules of 8 dimensions. Hence, we would need 32 independent convolution layers with 8 output channels each.
In the repo: it is implemented with 8 independent convolution layers with 32 output channels each.
reference:
https://github.com/gram-ai/capsule-networks/blob/master/capsule_network.py#L90
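To make the difference concrete, here is a minimal numpy sketch of the two groupings (shapes only; random arrays stand in for real conv outputs, and the names are mine, not the repo's). Both arrangements end up with 32 x 6 x 6 = 1152 primary capsules of 8 dimensions, but in the paper each capsule's 8 dimensions come from one conv, while in the repo each of the 8 convs contributes one dimension to every capsule:

```python
import numpy as np

B, G, D, H, W = 2, 32, 8, 6, 6  # batch, capsule groups, capsule dim, 6x6 grid

# Paper-style: 32 independent convs, each with 8 output channels.
# Simulated output: one [B, 8, 6, 6] feature map per conv.
paper_maps = [np.random.randn(B, D, H, W) for _ in range(G)]
paper_caps = np.stack(paper_maps, axis=1)          # [B, 32, 8, 6, 6]
paper_caps = paper_caps.transpose(0, 1, 3, 4, 2)   # [B, 32, 6, 6, 8]
paper_caps = paper_caps.reshape(B, G * H * W, D)   # [B, 1152, 8]

# Repo-style: 8 independent convs, each with 32 output channels.
# Each conv contributes ONE dimension to every capsule.
repo_maps = [np.random.randn(B, G, H, W) for _ in range(D)]
repo_caps = np.stack([m.reshape(B, -1) for m in repo_maps], axis=-1)  # [B, 1152, 8]

assert paper_caps.shape == repo_caps.shape == (B, 1152, 8)
```

Since the convolutions are learned, the two groupings have the same parameter count and output shape; whether they are functionally equivalent is exactly the question this issue raises.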
- According to the paper, every primary capsule should have a probability distribution over the digit capsules to which it distributes its output, so the total dimension of the logits (i.e., the routing probabilities) should be [batch_size=100, num_primary_capsules=1152, num_digit_capsules=10]. But in the implementation, the dimension is [batch_size=100, num_primary_capsules=1152, num_digit_capsules=10, digit_capsule_dim=16].
https://github.com/gram-ai/capsule-networks/blob/master/capsule_network.py#L67
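A small numpy sketch of the shape difference (batch size reduced from 100 to 4 to keep the arrays small; the softmax helper is mine, standing in for the framework's):

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

B, N_PRIMARY, N_DIGIT, DIGIT_DIM = 4, 1152, 10, 16

# Paper-style routing logits: ONE scalar per (primary capsule, digit capsule) pair.
b_paper = np.zeros((B, N_PRIMARY, N_DIGIT))
c_paper = softmax(b_paper, axis=2)  # couplings sum to 1 over the 10 digit capsules
assert np.allclose(c_paper.sum(axis=2), 1.0)

# Repo-style: an extra trailing 16-dim axis, i.e. 16 logits per pair.
b_repo = np.zeros((B, N_PRIMARY, N_DIGIT, DIGIT_DIM))
c_repo = softmax(b_repo, axis=2)
print(c_repo.shape)  # (4, 1152, 10, 16)
```

In the repo's version each of the 16 output dimensions is effectively routed with its own coupling coefficient, rather than one coefficient per capsule pair as the paper describes.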
Please let me know your comments on this. #24
- That's correct
- Haven't double checked (yet).
Is there a need to redo the routing iterations (in the digit capsules) for predict/evaluation/test?
@nitinsurya In my view, even during prediction/test, the routing starts with equal probabilities for each primary capsule, so we need these iterations even during test. What do you think?
I agree @InnovArul. I got a chance to go through the routing algorithm again and realized these weights are computed dynamically: they depend on the underlying capsule values, which change with each example. So the route weights should be independent for each example, which is what the current code does.
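The point above can be seen directly in a minimal routing-by-agreement sketch (following the paper's algorithm, not the repo's exact code; function names are mine). The logits `b` are zero-initialized inside the forward pass, so every example, train or test, gets its own freshly computed coupling coefficients:

```python
import numpy as np

def squash(s, axis=-1):
    # Squash nonlinearity from the paper: scales the vector norm into [0, 1).
    sq = (s ** 2).sum(axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + 1e-9)

def route(u_hat, iterations=3):
    """Dynamic routing. u_hat: [B, n_primary, n_digit, dim] prediction vectors.
    Logits b start at zero on EVERY call -- there is no trained routing state,
    so the iterations must also run at predict/test time."""
    B, n_primary, n_digit, dim = u_hat.shape
    b = np.zeros((B, n_primary, n_digit))
    for _ in range(iterations):
        e = np.exp(b - b.max(axis=2, keepdims=True))
        c = e / e.sum(axis=2, keepdims=True)       # softmax over digit capsules
        s = (c[..., None] * u_hat).sum(axis=1)     # weighted sum -> [B, n_digit, dim]
        v = squash(s)
        b = b + (u_hat * v[:, None]).sum(axis=-1)  # agreement update per example
    return v

v = route(np.random.randn(2, 1152, 10, 16))
assert v.shape == (2, 10, 16)
```

Because `b` is a local variable rather than a learned parameter, the couplings adapt to each example's prediction vectors, which is why skipping the iterations at test time would change the output.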