CenterTrack
change the training image resolution
Hi,
I want to train on my own dataset. I have converted my data to the MOT data format and trained it successfully. Now I want to change the input image resolution with the command python main.py tracking --exp_id mot17_half --dataset mot --dataset_version 17halftrain --pre_hm --ltrb_amodal --same_aug --hm_disturb 0.05 --lost_disturb 0.4 --fp_disturb 0.1 --gpus 0,1 --load_model ../models/crowdhuman.pth --input_h 144 --input_w 960, but the following error occurs:
loading annotations into memory...
Done (t=0.72s)
creating index...
index created!
Creating video index!
Loaded MOT 17halftrain train 13995 samples
Starting training...
tracking/mot17_half
Traceback (most recent call last):
File "main.py", line 101, in
Can you share some advice about this? Best wishes!
I have the same question. Could you also tell me the image resolution you used in your training command? Must it be 544 and 960?
Take a look at https://github.com/xingyizhou/CenterTrack/blob/d3d52145b71cb9797da2bfb78f0f1e88b286c871/src/lib/model/networks/dla.py#L305-L316
Here, in line 313, self.level{i} is called. For me, it broke when i was equal to 4.
On the next line, we can see that the previous outputs are appended to y. Let's take a look at their shapes:
ipdb> p list(map(lambda x: x.shape, y))
[torch.Size([4, 16, 1920, 1080]), torch.Size([4, 32, 960, 540]), torch.Size([4, 64, 480, 270]), torch.Size([4, 128, 240, 135])]
I can see here that I started with full-HD input, and the dimensions were reduced by a factor of 2 at every level. The last shape includes 135, which is not divisible by 2.
My error says RuntimeError: The size of tensor a (68) must match the size of tensor b (67) at non-singleton dimension 3. We can see that it arises because 135 is odd.
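To see where the off-by-one comes from, here is a minimal sketch of the arithmetic (my own illustration, not CenterTrack code): a stride-2 convolution with kernel 3 and padding 1 rounds an odd size up, while a plain stride-2 downsample floors it, so two branches can disagree once an odd dimension appears.

```python
def conv_out(n, k=3, s=2, p=1):
    # output length of a stride-2 conv (kernel 3, padding 1): rounds odd sizes up
    return (n + 2 * p - k) // s + 1

def floor_half(n):
    # a plain stride-2 downsample simply floors
    return n // 2

for n in [1080, 540, 270, 135]:
    print(n, "->", conv_out(n), "vs", floor_half(n))
```

Starting from a height of 1080, the two paths agree (540, 270, 135) until 135, where one side produces 68 and the other 67 -- exactly the pair in the error message above.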
If we look here:
https://github.com/xingyizhou/CenterTrack/blob/d3d52145b71cb9797da2bfb78f0f1e88b286c871/src/lib/model/networks/dla.py#L243-L255
We can see that there are 5 levels, and it seems that the image resolution should be divisible by 2 at least 5 times, which means it should be divisible by 2^5 = 32.
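A simple way to pick a legal resolution, assuming you just want the nearest size that satisfies the constraint, is to round each dimension up to a multiple of 32 (round_up_32 is my own hypothetical helper, not part of CenterTrack):

```python
def round_up_32(x):
    # round a dimension up to the nearest multiple of 32
    return ((x + 31) // 32) * 32

print(round_up_32(1080))  # -> 1088
print(round_up_32(1920))  # -> 1920
```

So a full-HD frame would be padded or resized to 1088x1920 before being fed to the network.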
Now the question is whether this is the desired behavior, and whether there is a workaround. I've tried to run main with another arch but stumbled upon #45. It might be simpler to just change the size of the images a little.
Sorry for the delayed reply. Yes, the input resolution should be divisible by 32 for DLA34.
I had one question regarding this issue: if we pass the input_h and input_w args during training, does it automatically resize the images?
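For what it's worth, whenever images are resized for training, the annotations have to be scaled by the same factors. A minimal sketch of that idea (resize_boxes is a hypothetical helper; CenterTrack's own loader warps images with an affine transform, but the scaling is analogous):

```python
def resize_boxes(boxes, src_hw, dst_hw):
    # Scale ltrb boxes when an image is resized from src_hw to dst_hw
    # (height, width). Aspect ratio is not preserved here.
    sy = dst_hw[0] / src_hw[0]
    sx = dst_hw[1] / src_hw[1]
    return [[l * sx, t * sy, r * sx, b * sy] for l, t, r, b in boxes]

# a box on a 1080x1920 frame, mapped onto a 544x960 training input
print(resize_boxes([[100, 50, 300, 250]], (1080, 1920), (544, 960)))
```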