pytorch-deeplab-xception
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
When I train my COCO-style dataset with bash train_coco.sh, I get the following error:
Namespace(backbone='resnet', base_size=513, batch_size=4, checkname='deeplab-resnet', crop_size=513, cuda=True, dataset='coco', epochs=10, eval_interval=1, freeze_bn=False, ft=False, gpu_ids=[0], loss_type='ce', lr=0.01, lr_scheduler='poly', momentum=0.9, nesterov=False, no_cuda=False, no_val=False, out_stride=16, resume=None, seed=1, start_epoch=0, sync_bn=False, test_batch_size=4, use_balanced_weights=False, use_sbd=True, weight_decay=0.0005, workers=4)
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Using poly LR Scheduler!
Starting Epoch: 0
Total Epoches: 10
0%| | 0/1 [00:00<?, ?it/s]
=>Epoches 0, learning rate = 0.0100, previous best = 0.0000
Traceback (most recent call last):
File "train.py", line 306, in
@jfzhang95 @lyd953621450 Hi, I have faced the same problem. Are you training your own dataset? I think the key is the labels.
@lyd953621450 @HuangLian126 I think the reason is that the BatchNorm layer in global_avg_pool requires a batch size larger than 1. If you have already set a batch size larger than 1 and still face this problem, it is probably because the number of training samples modulo the batch size is 1, so the last batch of every epoch contains a single sample. In that case, I suggest setting the drop_last flag of the DataLoader to True to drop that last single sample, as in the sketch below.
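For reference, drop_last is a standard argument of torch.utils.data.DataLoader. A minimal sketch with a placeholder dataset (the tensors and sizes below are made up for illustration, not the repo's actual loader):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 9 samples with batch_size=4 would normally leave a final batch of 1,
# which is what trips the BatchNorm error during training.
images = torch.randn(9, 3, 513, 513)
masks = torch.randint(0, 21, (9, 513, 513))
dataset = TensorDataset(images, masks)

train_loader = DataLoader(
    dataset,
    batch_size=4,
    shuffle=True,
    drop_last=True,  # discard the incomplete final batch instead of feeding it to the model
)

for batch_images, batch_masks in train_loader:
    print(batch_images.shape)  # always torch.Size([4, 3, 513, 513]); the leftover sample is dropped
```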
Set the batch size to be larger than 1.
@hlwang1124 I have the same issue.
How can I set the drop_last flag to True?
Also, if there is a problem with my dataset, could that trigger this issue?
python train.py --backbone xception --lr 0.0001 --epochs 10 --batch-size 2 --gpu-ids 0 --checkname deeplab-xception
Hi, I still get this error even though my batch size is 2. Have you solved this problem? @YadongLau @kimsu1219 @HuangLian126
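One thing worth checking (a general observation, not specific to your setup): with batch_size=2, any odd number of training samples still leaves a final batch of a single image unless drop_last is set, so the error can reappear even though the configured batch size is larger than 1. A quick sanity check:

```python
# Replace n_train with len(your_train_set); 101 is just a hypothetical example.
n_train = 101
batch_size = 2

last_batch_size = n_train % batch_size or batch_size
print(last_batch_size)  # 1 -> the last batch would still hit the BatchNorm error unless drop_last=True
```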