facenet-pytorch
MTCNN returns no detected faces for a batched tensor input but works for the same tensors transformed into a list of PIL images
Problem
We have our input images as a batched tensor reals
of shape [B, C, H, W].
We feed this into mtcnn after permuting to change shape into [B, H, W, C] as follows:
faces = mtcnn(reals.permute(0, 2, 3, 1))
We get faces to be a list of Nones, i.e. faces == [None] * B
However, when we convert the same input images from the batched tensor to PIL Images, it works. This is done as follows:
reals = [transforms.ToPILImage()(t) for t in reals]
This basically creates a list of size B with each element being the PIL Image.
After faces = mtcnn(reals)
, we get a list of length B containing the expected cropped image tensors.
Does the MTCNN.forward() not support batched tensors as input?
Notes
- Getting rid of the permute causes errors (I think MTCNN expects tensors to have the shape [B, H, W, C]).
- Checked the input tensors by converting them to PIL images to verify that they are actually real images with faces. They are.
- In the second case, where the input is a list of PIL Images, the output list of tensors from mtcnn contains the correct cropped images; this has been verified by visualizing them.
I was also struggling to make the code work by just feeding the images after the permute; it was only after I came across this issue that I tried the PIL conversion, and it worked. I might dig into the code to see why this only works with PIL images. Strangely, I couldn't find anyone else pointing this out, even though almost every user should hit this.
I found that doing this gives the desired result:
faces = mtcnn(reals.permute(0, 2, 3, 1)*255.0)
It might be because of the expected data range: the permuted tensor has values in [0, 1], while MTCNN seems to expect pixel values in [0, 255].
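If the data range is the cause, the effect can be illustrated without running detection at all. The sketch below assumes MTCNN internally standardizes inputs roughly as (x - 127.5) / 128.0 (as facenet-pytorch's fixed_image_standardization does); a [0, 1] tensor then collapses into a narrow band near -1, while a [0, 255] tensor spans the full usable [-1, 1] range:

```python
import torch

def fixed_image_standardization(image_tensor):
    # Assumed to mirror facenet-pytorch's internal normalization
    return (image_tensor - 127.5) / 128.0

reals = torch.rand(2, 3, 160, 160)                 # values in [0, 1], as in the issue

low = fixed_image_standardization(reals)           # every value squashed near -1
high = fixed_image_standardization(reals * 255.0)  # spans roughly [-1, 1]

print(low.min().item(), low.max().item())
print(high.min().item(), high.max().item())
```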
Also, the documentation is a little bit ambiguous here:
https://github.com/timesler/facenet-pytorch/blob/fa70227bd5f02209512f60bd10e7e66877fdb4f6/models/mtcnn.py#L281