MTCNN returns no detected faces for a batched tensor input but works for the same tensors transformed into a list of PIL images

Open ms337 opened this issue 2 years ago • 3 comments

Problem

We have our input images as a batched tensor reals of shape [B, C, H, W].

We feed this into mtcnn after permuting to change shape into [B, H, W, C] as follows:

faces = mtcnn(reals.permute(0, 2, 3, 1))

faces comes back as a list of Nones, i.e. faces == [None] * B

However, when we convert the same input images from the batched tensor to PIL Images, it works. This is done as follows:

reals = [transforms.ToPILImage()(t) for t in reals]

This creates a list of length B in which each element is a PIL Image.

After faces = mtcnn(reals), we get a list of length B containing the expected cropped image tensors.

Does the MTCNN.forward() not support batched tensors as input?

Notes

  • Getting rid of the permute causes errors (I think MTCNN expects tensors to have the shape [B, H, W, C]).
  • Checked the input tensors by converting them to PIL images to confirm they are actually real images with faces. They are.
  • In the second case of giving input as [PIL.Image], the output list of tensors of mtcnn contains correct cropped images and this has been checked by visualizing the images.

ms337 avatar Aug 04 '21 04:08 ms337

I was also struggling to get the code to work by feeding in the images after the permute. It was only after I came across this issue that I tried the PIL conversion, and it worked! I might dig into the code to see why this only works with PIL images. Strange that I couldn't find anyone else pointing this out; almost every user should run into it.

AmitSharma1127 avatar Dec 08 '21 14:12 AmitSharma1127

I found that doing this gives the desired result: faces = mtcnn(reals.permute(0, 2, 3, 1) * 255.0). It might be because of the data range.
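The data-range explanation is consistent with what transforms.ToPILImage() does: for a float tensor it scales values by 255 and converts to 8-bit, so the PIL path hands MTCNN pixels in 0-255. A minimal sketch of the same workaround as a helper, assuming reals holds floats in [0, 1] (the helper name to_mtcnn_batch is made up for illustration and is not part of facenet-pytorch):

```python
import torch


def to_mtcnn_batch(reals: torch.Tensor) -> torch.Tensor:
    """Rearrange a [B, C, H, W] batch to [B, H, W, C] and rescale to 0-255.

    Assumption: `reals` holds floats in [0, 1]. If the data is normalized
    differently (for example to [-1, 1]), the scaling must be adjusted.
    """
    if reals.dim() != 4:
        raise ValueError(f"expected a 4-D batch, got shape {tuple(reals.shape)}")
    # Move channels last, then stretch [0, 1] to the 0-255 pixel range.
    return reals.permute(0, 2, 3, 1) * 255.0
```

With this, faces = mtcnn(to_mtcnn_batch(reals)) should behave like the one-liner above. Printing reals.min() and reals.max() before the call is a quick way to check whether the [0, 1] assumption actually holds for your data.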

JohnParkerLee avatar Jun 03 '22 17:06 JohnParkerLee

Also, the documentation is a little bit ambiguous here:

https://github.com/timesler/facenet-pytorch/blob/fa70227bd5f02209512f60bd10e7e66877fdb4f6/models/mtcnn.py#L281

Sir-Photch avatar Jul 20 '23 03:07 Sir-Photch