self-critical.pytorch

Inference fails when an image has no detected objects in the bottom-up features

gsrivas4 opened this issue 4 years ago • 3 comments

For my current evaluation I am using my own computed bottom-up features. In these bottom-up features, some of the images have zero detected regions. When I run these bottom-up features through your repo, it fails for the batches in which such an image is present, giving the error below:

Traceback (most recent call last):
  File "eval.py", line 176, in <module>
    vars(opt))
  File "/home/default/ephemeral_drive/work/image_captioning/object_relation_transformer_cloned/eval_utils.py", line 141, in eval_split
    seq = model(fc_feats, att_feats, att_masks, opt=eval_kwargs, mode='sample')[0].data
  File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/default/ephemeral_drive/work/image_captioning/object_relation_transformer_cloned/models/CaptionModel.py", line 31, in forward
    return getattr(self, '_'+mode)(*args, **kwargs)
  File "/home/default/ephemeral_drive/work/image_captioning/object_relation_transformer_cloned/models/TransformerModel.py", line 505, in _sample
    return self._sample_beam(fc_feats, att_feats, att_masks, opt)
  File "/home/default/ephemeral_drive/work/image_captioning/object_relation_transformer_cloned/models/TransformerModel.py", line 427, in _sample_beam
    att_feats, seq, att_masks, seq_mask = self._prepare_feature(att_feats, att_masks)
  File "/home/default/ephemeral_drive/work/image_captioning/object_relation_transformer_cloned/models/TransformerModel.py", line 347, in _prepare_feature
    att_feats = pack_wrapper(self.att_embed, att_feats, att_masks)
  File "/home/default/ephemeral_drive/work/image_captioning/object_relation_transformer_cloned/models/AttModel.py", line 43, in pack_wrapper
    packed, inv_ix = sort_pack_padded_sequence(att_feats, att_masks.data.long().sum(1))
  File "/home/default/ephemeral_drive/work/image_captioning/object_relation_transformer_cloned/models/AttModel.py", line 31, in sort_pack_padded_sequence
    tmp = pack_padded_sequence(input[indices], sorted_lengths, batch_first=True)
  File "/usr/local/lib64/python3.6/site-packages/torch/nn/utils/rnn.py", line 244, in pack_padded_sequence
    _VF._pack_padded_sequence(input, lengths, batch_first)
RuntimeError: Length of all samples has to be greater than 0, but found an element in 'lengths' that is <= 0
Terminating BlobFetcher

In the batch that caused the error, the fifth image had zero detections.
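
A minimal repro of the underlying PyTorch behavior (illustrative shapes, not from the repo):

import torch
from torch.nn.utils.rnn import pack_padded_sequence

feats = torch.randn(2, 5, 8)    # 2 images, up to 5 regions, 8-dim features
lengths = torch.tensor([5, 0])  # second image: zero detected regions
# Raises: RuntimeError: Length of all samples has to be greater than 0 ...
pack_padded_sequence(feats, lengths, batch_first=True)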

I am not yet sure whether the error is caused by the images with no detected regions, but I wanted to ask if there is a requirement that every image have at least one detected region. If there is such a requirement, it would be great if you could give me a direction on which part of your code should be amended, so that the model can either process or skip such images.

gsrivas4 avatar Aug 21 '20 17:08 gsrivas4

You can replace pack_wrapper with:

def pack_wrapper(module, att_feats, att_masks):
    return module(att_feats)

Under the default settings, there is no difference.
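
A small sketch of that equivalence (illustrative shapes; nn.Linear stands in for att_embed, assuming no batch norm, which is the default):

import torch
from torch import nn
from torch.nn.utils.rnn import PackedSequence, pack_padded_sequence, pad_packed_sequence

embed = nn.Linear(8, 4)       # stand-in for att_embed without batch norm
feats = torch.randn(2, 5, 8)
lengths = torch.tensor([5, 3])

direct = embed(feats)         # what `return module(att_feats)` computes

packed = pack_padded_sequence(feats, lengths, batch_first=True)
out = PackedSequence(embed(packed.data), packed.batch_sizes)
padded, _ = pad_packed_sequence(out, batch_first=True, total_length=5)

# Equal on all valid positions; only the padding rows differ, and those are
# ignored downstream via att_masks.
assert torch.allclose(direct[0], padded[0])
assert torch.allclose(direct[1, :3], padded[1, :3])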

ruotianluo avatar Aug 21 '20 19:08 ruotianluo

When you mention the default settings, do you mean the default settings for the Transformer, and could this be an issue for LSTMs, etc.? My understanding is that the sort_pack_padded_sequence and pad_unsort_packed_sequence functions help improve the performance of LSTMs and do not affect the logic. Is that correct?

I tried to understand what the function pack_wrapper does. To handle batches that contain images with no detected objects, I add zero tensors to the output for those images. Below is my code:

def pack_wrapper(module, att_feats, att_masks):
    if att_masks is not None:
        # True when an image has *no* detected regions (all-zero features)
        boolmask = att_feats.sum((1, 2)) == 0
        if boolmask.sum() != 0:
            # Run the embed only over the images that have detections
            tmp_feats = att_feats[att_feats.sum((1, 2)) != 0]
            tmp_masks = att_masks[att_masks.sum(1) != 0]
            packed, inv_ix = sort_pack_padded_sequence(tmp_feats, tmp_masks.data.long().sum(1))
            processed_feats = pad_unsort_packed_sequence(PackedSequence(module(packed[0]), packed[1]), inv_ix)
            processed_feats_shape = processed_feats.shape
            # Allocate with matching device/dtype so the row copies below also work on GPU
            result_vector = torch.empty([att_feats.shape[0]] + list(processed_feats_shape[1:]),
                                        dtype=processed_feats.dtype, device=processed_feats.device)
            ii, jj = 0, 0
            for bb in boolmask:
                if not bb:
                    result_vector[ii] = processed_feats[jj]
                    jj += 1
                else:
                    # All-zero output row for an image with no detections
                    result_vector[ii] = torch.zeros(processed_feats_shape[1:])
                ii += 1
            return result_vector
        else:
            packed, inv_ix = sort_pack_padded_sequence(att_feats, att_masks.data.long().sum(1))
            return pad_unsort_packed_sequence(PackedSequence(module(packed[0]), packed[1]), inv_ix)
    else:
        return module(att_feats)
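
For reference, a quick sanity check of the modified function (hypothetical shapes and embed module; assumes it runs inside models/AttModel.py, where sort_pack_padded_sequence, pad_unsort_packed_sequence, and PackedSequence are in scope):

import torch
from torch import nn

embed = nn.Sequential(nn.Linear(8, 4), nn.ReLU())
att_feats = torch.zeros(2, 5, 8)
att_feats[0] = torch.randn(5, 8)  # image 0: 5 detected regions; image 1: none
att_masks = torch.zeros(2, 5)
att_masks[0] = 1

out = pack_wrapper(embed, att_feats, att_masks)
print(out.shape)  # torch.Size([2, 5, 4]); the row for image 1 is all zeros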

Could you tell me whether the edited code looks fine? If it doesn't, or if it breaks some other aspect of the code, is doing return module(att_feats) advisable for all batches of images, or only for the batches that contain images with no detected objects?

gsrivas4 avatar Aug 23 '20 22:08 gsrivas4

The pack and unpad here have nothing to do with the LSTM. The reason I have it is that, under some settings, there is a batch norm layer in att_embed, and I don't want the zero padding tensors to affect its statistics, so I unpad before running att_embed and pad back afterwards.
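
A toy illustration of that batch norm concern (hypothetical numbers, assuming a BatchNorm1d over the region features):

import torch
from torch import nn

torch.manual_seed(0)
real = torch.randn(6, 8) + 3.0                 # 6 valid region features
padded = torch.cat([real, torch.zeros(4, 8)])  # plus 4 all-zero padding rows

bn_clean, bn_padded = nn.BatchNorm1d(8), nn.BatchNorm1d(8)
bn_clean(real)      # running stats estimated from the valid rows only
bn_padded(padded)   # the zero rows drag the estimated mean toward zero
print(bn_clean.running_mean.mean().item(), bn_padded.running_mean.mean().item())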

I would suggest just doing return module(att_feats).
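
With that change, a batch containing an image with zero detections never reaches pack_padded_sequence, so the error above goes away (a sketch with illustrative shapes; the Sequential stands in for att_embed without batch norm):

import torch
from torch import nn

embed = nn.Sequential(nn.Linear(8, 4), nn.ReLU())
att_feats = torch.zeros(2, 5, 8)
att_feats[0] = torch.randn(5, 8)  # image 1 has no detected regions

out = embed(att_feats)  # what the simplified pack_wrapper computes
print(out.shape)        # torch.Size([2, 5, 4])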

ruotianluo avatar Aug 23 '20 22:08 ruotianluo