deep-text-recognition-benchmark

forward() missing 2 required positional arguments: 'input' and 'text'

Open dikers opened this issue 5 years ago • 4 comments

When I use demo.py to predict on my images, I changed batch_size from 192 to 32:

parser.add_argument('--batch_size', type=int, default=192, help='input batch size')  # before
parser.add_argument('--batch_size', type=int, default=32, help='input batch size')   # after

The last iteration of the loop then raises an error.

for image_tensors, image_path_list in demo_loader:
    batch_size = image_tensors.size(0)
    image = image_tensors.to(device)
    # For max-length prediction
    length_for_pred = torch.IntTensor([opt.batch_max_length] * batch_size).to(device)
    text_for_pred = torch.LongTensor(batch_size, opt.batch_max_length + 1).fill_(0).to(device)

    if 'CTC' in opt.Prediction:
        preds = model(image, text_for_pred)

        # Select max probability (greedy decoding), then decode index to character
        preds_size = torch.IntTensor([preds.size(1)] * batch_size)
        _, preds_index = preds.max(2)
        # FIXME: edit by dikers, see https://github.com/clovaai/deep-text-recognition-benchmark/issues/185
        # preds_index = preds_index.view(-1)
        preds_str = converter.decode(preds_index.data, preds_size.data)

    else:
        preds = model(image, text_for_pred, is_train=False)

        # Select max probability (greedy decoding), then decode index to character
        _, preds_index = preds.max(2)
        preds_str = converter.decode(preds_index, length_for_pred)

The following error occurred:

Traceback (most recent call last):
  File "demo.py", line 138, in <module>
    demo(opt)
  File "demo.py", line 68, in demo
    preds = model(image, text_for_pred, is_train=False)
  File "/home/ec2-user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ec2-user/.local/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/ec2-user/.local/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/ec2-user/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/ec2-user/.local/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in replica 2 on device 2.
Original Traceback (most recent call last):
  File "/home/ec2-user/.local/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/ec2-user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() missing 2 required positional arguments: 'input' and 'text'

It looks like the problem occurs when the number of images is not divisible by the batch size (123 / 32).
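
A quick sanity check of the batch sizes (a minimal sketch; 123 is my number of test images):

num_images, batch_size = 123, 32
print(num_images % batch_size)  # 27 -> the final batch is smaller than the rest

So only the final, smaller batch hits the error.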

Could you please help me check it? Thank you.

dikers avatar Jul 22 '20 06:07 dikers

Hello,

I have two questions that may help resolve this problem.

  1. Does this error occur when you try with batch_size 192? Even with batch_size 192, the number of images would not be divisible by the batch size (odd_number / 192), so I want to check whether batch_size is really the cause of the problem.

  2. Does this error occur when you use a single GPU? This error may not occur on a single GPU; a quick way to test that is sketched below.
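
A minimal way to test question 2, assuming the usual demo.py setup where the model is wrapped in torch.nn.DataParallel:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # expose a single GPU; set this before CUDA is initialised

# or skip the multi-GPU wrapper entirely when building the model:
# model = Model(opt).to(device)  # instead of torch.nn.DataParallel(Model(opt)).to(device)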

Best

ku21fan avatar Jul 30 '20 11:07 ku21fan

Hi,

I am currently experiencing the same error during training:

# standard imports, plus this repo's modules (Model, the converters, and Averager
# live in model.py / utils.py)
import time
import numpy as np
import torch
import torch.optim as optim
from torch.nn import CTCLoss
from tqdm import tqdm
from model import Model
from utils import CTCLabelConverter, AttnLabelConverter, Averager

if rgb:
    input_channel = 3

model = Model()
model = torch.nn.DataParallel(model, device_ids=[0])  # .to('cuda:0')
model.load_state_dict(torch.load(saved_model, map_location=device), strict=False)

if 'CTC' in prediction:
    # Ignoring baiduCTC
    converter = CTCLabelConverter(character)
    criterion = CTCLoss()
else:
    converter = AttnLabelConverter(character)
    criterion = torch.nn.CTCLoss(zero_infinity=True).to(device)  # NB: this is the mistake identified below; should be CrossEntropyLoss

num_class = len(converter.character)

loss_avg = Averager()

# collect only the trainable parameters
filtered_parameters = []
params_num = []
for p in filter(lambda p: p.requires_grad, model.parameters()):
    filtered_parameters.append(p)
    params_num.append(np.prod(p.size()))
print('Trainable params num : ', sum(params_num))

optimizer = optim.Adam(filtered_parameters, lr=0.001)  # , betas=(beta1, 0.999))

start_iter = 0
start_time = time.time()
best_accuracy = -1
best_norm_ED = -1
iteration = start_iter
torch.cuda.empty_cache()

# train loop
for img_path_batch, img_batch, text_batch, objects_batch in tqdm(data_ldr):
    image = img_batch.to(device)
    print(text_batch)  # debug
    text, length = converter.encode(text_batch, batch_max_length=batch_max_length)
    batch_size = image.size(0)

    if 'CTC' in prediction:
        preds = model(image, text)
        preds_size = torch.IntTensor([preds.size(1)] * batch_size)  # one length per sample

        preds = preds.log_softmax(2).permute(1, 0, 2)
        cost = criterion(preds, text, preds_size, length)
    else:
        preds = model(image, text[:, :-1], objects_batch)  # align with Attention.forward
        target = text[:, 1:]  # without [GO] symbol
        cost = criterion(preds.view(-1, preds.shape[-1]), target.contiguous().view(-1))

    model.zero_grad()
    cost.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 5)  # gradient clipping with 5 (default)
    optimizer.step()

    loss_avg.add(cost)
    iteration += 1

print('finish training')

Error:

TypeError                                 Traceback (most recent call last)
<ipython-input-60-2cfde1ea9c69> in <module>
     22         preds = model(image, text[:, :-1], objects_batch)  # align with Attention.forward
     23         target = text[:, 1:]  # without [GO] Symbol
---> 24         cost = criterion(preds.view(-1, preds.shape[-1]), target.contiguous().view(-1))
     25 
     26         model.zero_grad()

~\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

TypeError: forward() missing 2 required positional arguments: 'input_lengths' and 'target_lengths'

I get this error with a batch size of 192, training on a single GPU.

I have realised I made a mistake when defining my criterion: for the attention branch I was using CTCLoss when I should have been using criterion = torch.nn.CrossEntropyLoss(ignore_index=0).to(device).
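
For anyone else hitting this, a sketch of the corrected criterion selection, keeping the variable names from the snippet above:

if 'CTC' in prediction:
    # Ignoring baiduCTC
    criterion = CTCLoss()
else:
    # CrossEntropyLoss, not CTCLoss: the attention decoder is trained per character,
    # and index 0 (the [GO] token) is ignored in the loss
    criterion = torch.nn.CrossEntropyLoss(ignore_index=0).to(device)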

JoshuaPlacidi avatar Dec 10 '20 20:12 JoshuaPlacidi

Facing the same problem.

GraceKafuu avatar May 20 '21 08:05 GraceKafuu

Maybe you just need to set drop_last=True in your DataLoader.
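
For example, assuming the DataLoader construction from demo.py (a sketch; your loader arguments may differ):

demo_loader = torch.utils.data.DataLoader(
    demo_data, batch_size=opt.batch_size,
    shuffle=False,
    num_workers=int(opt.workers),
    collate_fn=AlignCollate_demo, pin_memory=True,
    drop_last=True)  # drop the smaller final batch so every batch splits evenly across GPUs

Note that drop_last=True silently skips the leftover images, so for inference you may prefer to pad the final batch instead.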

ttxskk avatar Aug 18 '21 07:08 ttxskk