AS-GCN
RuntimeError: shape '[-1, 3, 1, 25]' is invalid for input of size 3456
I ran the warmup pretraining for 9 epochs (instead of your 10 epochs) and that worked fine, and then wanted to continue with the training part, but hit the error from the title as soon as the 11th epoch started. In other words, the 10th epoch (epoch 9 in the log) trains without problems, and the forward pass then fails at the 11th (epoch 10) because x_last cannot be permuted and reshaped:
[08.13.19|01:43:40] Training epoch: 9
AS-GCN/net/utils/adj_learn.py:11: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
soft_max_1d = F.softmax(trans_input)
[08.13.19|01:43:41] Iter 0 Done. | loss2: 1961.2838 | loss_nll: 1938.3290 | loss_kl: 22.9549 | lr: 0.000500
[08.13.19|01:44:21] Iter 100 Done. | loss2: 1593.5586 | loss_nll: 1572.3160 | loss_kl: 21.2426 | lr: 0.000500
.....
[08.13.19|02:33:32] Iter 7400 Done. | loss2: 1933.1755 | loss_nll: 1911.2896 | loss_kl: 21.8860 | lr: 0.000500
[08.13.19|02:34:12] Iter 7500 Done. | loss2: 1391.6711 | loss_nll: 1369.8772 | loss_kl: 21.7940 | lr: 0.000500
[08.13.19|02:34:17] mean_loss2: 1974.6090346069418
[08.13.19|02:34:17] mean_loss_nll: 1951.943839369832
[08.13.19|02:34:17] mean_loss_kl: 22.665194850461106
[08.13.19|02:34:17] Time consumption:
[08.13.19|02:34:17] Done.
[08.13.19|02:34:17] The model has been saved as ./work_dir/recognition/kinetics/AS_GCN/max_hop_4/lamda_05/epoch9_model1.pt.
[08.13.19|02:34:17] The model has been saved as ./work_dir/recognition/kinetics/AS_GCN/max_hop_4/lamda_05/epoch9_model2.pt.
[08.13.19|02:34:17] Eval epoch: 9
[08.13.19|02:36:22] mean_loss2: 2030.5040628313056
[08.13.19|02:36:22] mean_loss_nll: 2008.193604698859
[08.13.19|02:36:22] mean_loss_kl: 22.310456798226845
[08.13.19|02:36:22] Done.
[08.13.19|02:36:22] Training epoch: 10
Traceback (most recent call last):
  File "main.py", line 30, in <module>
    p.start()
  File "AS-GCN/processor/processor.py", line 111, in start
    self.train(training_A=False)
  File "AS-GCN/processor/recognition.py", line 161, in train
    x_class, pred, target = self.model1(data, target_data, data_last, A_batch, self.arg.lamda_act)
  File "/home/petteri/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "AS-GCN/net/as_gcn.py", line 59, in forward
    x_last = x_last.permute(0,4,1,2,3).contiguous().view(-1,3,1,25)
RuntimeError: shape '[-1, 3, 1, 25]' is invalid for input of size 3456
I assume the error is propagated from the data generation step; I used the .npy generation code from the 2s-AGCN implementation: https://github.com/lshiwjx/2s-AGCN/blob/master/data_gen/kinetics_gendata.py
Looking at the sizes, is it the case that this line is hard-coded, with the trailing 25 being the number of joints in NTU-RGB+D? And since I use Kinetics instead of NTU-RGB+D, should it be set conditionally on the skeleton layout, as you already do for the OpenPose graph?
x_last = x_last.permute(0,4,1,2,3).contiguous().view(-1,3,1,25)
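For reference, a quick sanity check of the arithmetic (my own snippet, assuming the [32, 3, 1, 18, 2] x_last shape printed below):

import torch

# x_last on Kinetics: [N, C, T, V, M] = [32, 3, 1, 18, 2] -> 3456 elements, as in the error message
x_last = torch.zeros(32, 3, 1, 18, 2)
print(x_last.numel())                 # 3456
print(x_last.numel() % (3 * 1 * 25))  # 6  -> view(-1, 3, 1, 25) cannot work
print(x_last.numel() % (3 * 1 * 18))  # 0  -> view(-1, 3, 1, 18) gives 64 (= N*M) rows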
32 3 290 18 2 # (N, C, T, V, M)
x_last: torch.Size([32, 3, 1, 18, 2])
x_recon: torch.Size([32, 3, 290, 18])
x1: torch.Size([32, 3, 290, 18, 2])
x2: torch.Size([32, 2, 18, 3, 290])
x3: torch.Size([64, 54, 290])
These shapes were printed with the following instrumentation added to forward():
def forward(self, x, x_target, x_last, A_act, lamda_act):
    N, C, T, V, M = x.size()
    print(N, C, T, V, M)
    print('x_last: ', x_last.shape)
    x_recon = x[:, :, :, :, 0]                      # [2N, 3, 300, 25]
    print('x_recon: ', x_recon.shape)
    print('x1: ', x.shape)
    x = x.permute(0, 4, 3, 1, 2).contiguous()       # [N, 2, 25, 3, 300]
    print('x2: ', x.shape)
    x = x.view(N * M, V * C, T)                     # [2N, 75, 300]
    print('x3: ', x.shape)
    x_last = x_last.permute(0, 4, 1, 2, 3).contiguous().view(-1, 3, 1, 25)
    print('x_last: ', x_last.shape)
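Something along these lines fixes that particular line (a sketch, reusing the N, C, V, M already computed above instead of the hard-coded 25):

# x_last is [N, C, 1, V, M]; derive the view from x.size() rather than literals
x_last = x_last.permute(0, 4, 1, 2, 3).contiguous().view(N * M, C, 1, V)  # [2N, 3, 1, 18] on Kinetics

Later layers may of course still assume 25 nodes, so the same treatment is needed wherever the joint count is baked in.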
I also found a couple of other hard-coded 25s in your code, and changing them to a variable set to 18 fixed my training problem.
I changed all the joint-node counts in the network structure to 18, but I still get an error saying the size is different from the previous one. Could you tell me which files need to be changed?
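For anyone hitting the same "size different from the previous" error: one way to locate the layer that still expects 25 joints is to print the output shape of every module with forward hooks (a generic PyTorch debugging sketch, not part of the AS-GCN code):

import torch

def register_shape_hooks(model):
    # Print each leaf module's output shape so the first layer still built for 25 joints stands out.
    def hook(module, inputs, output):
        if isinstance(output, torch.Tensor):
            print(module.__class__.__name__, tuple(output.shape))
    for module in model.modules():
        if len(list(module.children())) == 0:  # leaf modules only
            module.register_forward_hook(hook)

# e.g. register_shape_hooks(self.model1) before the failing forward pass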