Matrix-Capsules-EM-PyTorch
ConvCaps p_in a_in view
I think there is an issue in the way the input tensor x is reshaped in order to extract a_in and p_in.
It seems to me that the dimensions of a_in and p_in require a permutation before applying Tensor.view().
Note that I changed the training batch size to 16, and I am using A, B, C, D = 32, 32, 32, 32.
Transformation before view:
After this line: https://github.com/yl-1993/Matrix-Capsules-EM-PyTorch/blob/9cb3fc04442b937a1769d6d62e86d1754b7d9083/model/capsules.py#L253 I get this:
p_in.shape
Out[2]: torch.Size([16, 3, 3, 6, 6, 512])
View:
The view is done in the following way: https://github.com/yl-1993/Matrix-Capsules-EM-PyTorch/blob/9cb3fc04442b937a1769d6d62e86d1754b7d9083/model/capsules.py#L255
To do the view in this way, p_in.shape should be:
torch.Size([16, 6, 6, 3, 3, 512])
Do you agree? I am new to PyTorch, so I might misunderstand the way Tensor.view() works.
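To make the concern concrete, here is a minimal sketch. The sizes are made up and far smaller than in the model, C stands in for the 512 channels, and the tensor is filled with torch.arange so individual elements can be traced. It compares viewing the reported (b, K, K, H', W', C) layout directly against permuting to (b, H', W', K, K, C) first.

```python
import torch

# Illustrative sizes only (assumed, not the model's real ones)
b, K, H, W, C = 2, 3, 4, 4, 5

# Layout reported above: (b, K, K, H', W', C)
p_in = torch.arange(b * K * K * H * W * C).view(b, K, K, H, W, C)

# view() only reinterprets the row-major memory; it does not move data
wrong = p_in.view(b * H * W, K * K, C)

# Permute to (b, H', W', K, K, C), copy into that order, then view
right = p_in.permute(0, 3, 4, 1, 2, 5).contiguous().view(b * H * W, K * K, C)

# The row for batch 0, output position (1, 2) should contain exactly the
# K*K kernel entries p_in[0, :, :, 1, 2, :]
expected = p_in[0, :, :, 1, 2, :].reshape(K * K, C)
row = (0 * H + 1) * W + 2

print("permute then view matches:", torch.equal(right[row], expected))
print("direct view matches:", torch.equal(wrong[row], expected))
```

In this sketch only the permuted version groups the K*K entries of one output position into a single row; the direct view produces the right shape but mixes entries from different positions.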
@tomahawk810
Sorry for the late reply; I have been busy preparing a CVPR submission.
As stated in https://github.com/yl-1993/Matrix-Capsules-EM-PyTorch/blob/9cb3fc04442b937a1769d6d62e86d1754b7d9083/model/capsules.py#L192, the shape of x after adding patches is (b, H', W', K, K, B*(P*P+1)).
Therefore, the shape of p_in is (b, H', W', K, K, B*P*P).
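As a rough illustration of these shapes, the sketch below builds a random x with the stated layout and slices it along the last dimension. The sizes are arbitrary, oh and ow stand for H' and W', and the ordering (pose channels first, activations last) is an assumption made for the example rather than something stated in this thread.

```python
import torch

# Illustrative sizes (assumed): batch, output size, kernel, capsule types, pose side
b, oh, ow, K, B, P = 2, 4, 4, 3, 8, 4
psize = P * P

# Stated layout after adding patches: (b, H', W', K, K, B*(P*P+1))
x = torch.randn(b, oh, ow, K, K, B * (psize + 1))

# Assumption: the first B*P*P channels hold poses, the last B hold activations
p_in = x[..., : B * psize]
a_in = x[..., B * psize :]

print(p_in.shape)  # (b, H', W', K, K, B*P*P)
print(a_in.shape)  # (b, H', W', K, K, B)
```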
The view can actually be split into two steps:
p_in = p_in.view(b*H'*W', K*K, B, P*P)
p_in = p_in.view(b*H'*W', K*K, B*P*P)
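A quick sanity check of the two steps, as a sketch with arbitrary small sizes (oh and ow stand for H' and W'): when p_in really has the layout (b, H', W', K, K, B*P*P), chaining the two views gives the same result as a single view, and each row holds the K*K patch of one output position.

```python
import torch

# Illustrative sizes (assumed)
b, oh, ow, K, B, P = 2, 4, 4, 3, 8, 4
psize = P * P

# p_in in the layout the views expect: (b, H', W', K, K, B*P*P)
p_in = torch.randn(b, oh, ow, K, K, B * psize)

step1 = p_in.view(b * oh * ow, K * K, B, psize)
step2 = step1.view(b * oh * ow, K * K, B * psize)
direct = p_in.view(b * oh * ow, K * K, B * psize)

# Row n corresponds to (batch, h, w) with n = (batch * oh + h) * ow + w
n = (1 * oh + 2) * ow + 3
patch = p_in[1, 2, 3].reshape(K * K, B * psize)

print("two-step equals direct:", torch.equal(step2, direct))
print("row is one output position:", torch.equal(step2[n], patch))
```

This only holds if the input layout is as stated; with a different dimension order the same views would still run but would group the wrong elements.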
@tomahawk810 I get your point now.
I checked the code and you are right about the output shape.
The issue lies in the add_patches function.
I have fixed the problem now. Could you please help review the PR https://github.com/yl-1993/Matrix-Capsules-EM-PyTorch/pull/4?
Thanks for pointing this out!
@tomahawk810 I merged the PR since there were no comments for two weeks. Please keep commenting if anything looks off.