video-swin-transformer-pytorch

The shape of the logits

Open: neverUseThisName opened this issue 2 years ago • 1 comment

The output `logits` has shape (1, 768, 8, 7, 7), but it should be (batch, num_class). How can I adapt the code to classify videos?

neverUseThisName, Jan 18 '22 08:01

The fc layer is defined in base.py under https://github.com/SwinTransformer/Video-Swin-Transformer/tree/master/mmaction/models/recognizers in the official implementation.

TheEighthDay, Mar 10 '22 07:03
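
For anyone landing here with the same shape mismatch, a minimal sketch of one way to go from the (batch, 768, T, H, W) feature map to (batch, num_class): average-pool over time and space, then apply a linear layer. This is not the code from this repo or the official one. The class name `PooledLinearHead`, the dropout rate, and `num_classes=400` are assumptions made for illustration, and the head is randomly initialised, so it still has to be trained or loaded with matching weights.

```python
import torch
import torch.nn as nn


class PooledLinearHead(nn.Module):
    """Hypothetical head: global average pool over (T, H, W), then one fc layer."""

    def __init__(self, in_channels: int = 768, num_classes: int = 400, dropout: float = 0.5):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)            # (B, C, T, H, W) -> (B, C, 1, 1, 1)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(in_channels, num_classes)  # (B, C) -> (B, num_classes)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        x = self.pool(feats)
        x = torch.flatten(x, 1)  # drop the three singleton dims
        x = self.dropout(x)
        return self.fc(x)


# Stand-in for the backbone output reported in the question.
feats = torch.randn(1, 768, 8, 7, 7)

# num_classes=400 is an assumed value; use your own dataset's class count.
head = PooledLinearHead(in_channels=768, num_classes=400).eval()
with torch.no_grad():
    logits = head(feats)

print(logits.shape)  # torch.Size([1, 400])
```

In practice `feats` would be replaced by the output of the backbone's forward pass, and the backbone and head would be trained (or fine-tuned) together.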