
I'm curious about the data and architecture of the pretrained Arcface model you used.

kyugorithm opened this issue 2 years ago • 3 comments

Hello. First of all, thank you for your wonderful results. I have a question about the pretrained ArcFace model.

Consider the case where ArcFace is used as the ID embedder. Depending on its training data, training for swapping may not have worked well for a particular mode (race, gender, age), so conversion performance is expected to be disproportionate across race, gender, and age.

In fact, compared with the face-swapping results on Western samples, the results on Asian samples seem to be much lower in quality (or, even when I use an Asian source image as input, the result looks like a Western face).

To address this, I would like to replace the model with a more recently trained ArcFace.

I would greatly appreciate it if you could tell me which data and which model architecture (ResNet-50, ResNet-100, etc.) were used to train the model. For example, the following GitHub page shows ArcFace performance under various conditions: https://github.com/deepinsight/insightface/tree/master/recognition/arcface_torch#ms1mv3

Thank you for reading and I look forward to your cool answer!

kyugorithm avatar May 25 '22 14:05 kyugorithm

Thank you for your very insightful issue. Are you sure these issues are related to the ArcFace model? In fact, most of the faces in VGGFace2 are Western; fewer than 10% are Asian.

neuralchen avatar May 26 '22 15:05 neuralchen

Hey, I'm not sure I understand your question, but you can always replace the detection and recognition models with different ones as long as they are compatible. As for the Asian/Western faces: it might be a visual bias. Faces with more prominent, angular features usually make for better swaps. The way the discriminator is trained is not biased toward one group or another; it just enforces a similarity quota. Certain recognition models can detect certain features with a higher rate of accuracy, and those can easily be downloaded from outside sources. Perhaps you should try finding a more distinct and varied picture of the source and reassess the results. SimSwap sacrifices a bit of identity similarity in exchange for facial-expression variance; you will see what I mean if you compare the results with a FaceShifter model or something else. Good luck, and apologies if my answer did not satisfy your question. Be well.

Fibonacci134 avatar May 27 '22 10:05 Fibonacci134

Hello @neuralchen, @kyugorithm, can you let me know which FR model was used for ID retrieval? If I want to replace or retrain the model, what are the restrictions on its output?

usmancheema89 avatar Aug 31 '22 03:08 usmancheema89

@usmancheema89 I'm sorry for the late reply.

There are no special restrictions. Instead of the model originally provided by the authors, we used the model distributed by insightface at the path below. (You can check the instructions in that repo.) https://onedrive.live.com/?cid=4a83b6b633b029cc&id=4A83B6B633B029CC%215577&authkey=!AFZjr283nwZHqbA

We also used models trained directly on the WebFace42M dataset.
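To make the "no special restrictions" point concrete, here is a minimal sketch of the interface contract an ID embedder typically has to satisfy in this kind of pipeline: take an aligned 112x112 face crop and return a single L2-normalized 512-d identity vector, so that cosine similarity is just a dot product. The function names (`embed_id`, `cosine_sim`) are illustrative, not SimSwap's actual API, and a fixed random projection stands in for a real FR backbone (e.g. an iResNet-100 ArcFace from insightface's arcface_torch) so the sketch runs without model weights.

```python
import numpy as np

EMBED_DIM = 512  # typical ArcFace embedding size

def l2_normalize(v, eps=1e-10):
    """Scale a vector to unit length so cosine similarity is a dot product."""
    return v / (np.linalg.norm(v) + eps)

def embed_id(face_crop, backbone=None):
    """Return a unit-norm identity embedding for an aligned 112x112 RGB crop.

    `backbone` would be a real FR model in practice; here a fixed random
    projection matrix stands in so the sketch is self-contained.
    """
    assert face_crop.shape == (112, 112, 3), "expects an aligned 112x112 RGB crop"
    x = face_crop.astype(np.float32).ravel() / 127.5 - 1.0  # scale pixels to [-1, 1]
    if backbone is None:
        rng = np.random.default_rng(0)  # stand-in projection, fixed seed
        backbone = rng.standard_normal((x.size, EMBED_DIM)).astype(np.float32)
    return l2_normalize(x @ backbone)

def cosine_sim(a, b):
    """Identity similarity between two unit-norm embeddings."""
    return float(np.dot(a, b))

face = np.zeros((112, 112, 3), dtype=np.uint8)  # placeholder aligned crop
vec = embed_id(face)
print(vec.shape, round(float(np.linalg.norm(vec)), 4))
```

Any replacement or retrained model that produces an embedding of this shape and normalization (after the same face alignment) should slot in, which is why the specific backbone and training set can be swapped.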

kyugorithm avatar Sep 27 '22 05:09 kyugorithm

@kyugorithm Thanks :)

usmancheema89 avatar Sep 27 '22 05:09 usmancheema89