insightface
Training to maximise Face Clustering performance
Hi
I'm looking to do some unsupervised face clustering with the face vectors produced by insightface. I trained on a custom dataset (MS1M + 10,000 of my own images) and have tried clustering with both HDBSCAN and DBSCAN (using cosine distance as the metric), but performance seems much lower than dlib + Chinese whispers clustering.
I trained insightface with embedding_size = 256 to reduce dimensionality (HDBSCAN performs poorly in high dimensions), but am still getting poor results (0 clusters, or hundreds of clusters when there should be around 10).
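For reference, a minimal sketch of the normalize-then-cluster pipeline I'm describing (the synthetic blob data and the eps/min_samples values are made up for illustration; only the overall shape matches what I'm doing):

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Two synthetic "identities": tight blobs of 256-D vectors around two random directions
a = rng.normal(0, 0.05, (50, 256)) + rng.normal(0, 1, 256)
b = rng.normal(0, 0.05, (50, 256)) + rng.normal(0, 1, 256)
X = normalize(np.vstack([a, b]))  # L2-normalize the embeddings

# Cluster with cosine distance; eps/min_samples would need tuning on real data
labels = DBSCAN(eps=0.3, metric='cosine', min_samples=5).fit_predict(X)
n_clusters = len(set(labels) - {-1})  # -1 marks noise points
```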
Has anyone tried training insightface to perform face clustering? If so, are there any steps that I can do to improve the performance?
Have you tried Chinese whispers with original insightface models? I have tried it some time ago and I recall it was working pretty well.
Hi @SthPhoenix!
I will try it now and report back. My issue with Chinese whispers clustering is that the Python implementation is fairly slow, but I will try it nonetheless.
Well, I can't seem to get any reasonable clustering done on my dataset using insightface and Chinese whispers; it constantly puts everything in its own cluster.
E.g. 26,161 images were put into 26,159 clusters (using dlib's Chinese whispers clustering).
Perhaps my dataset is just too difficult to cluster.
Are you using normalized embeddings for clustering?
Yes, I used sklearn for normalization (is there a better way?)
Here is the dataset if it helps (already aligned/resized to 112×112) - it includes people wearing masks, but I've had some luck with other recognition methods. I was just hoping for an improvement.
Link is broken
For some reason it's working now ) I have tried your dataset with dlib Chinese whispers at threshold 0.85: it gives around 2,500 clusters with a lot of outliers - clusters containing 1-5 faces. If you filter your dataset by embedding norm > 20 and remove all clusters with fewer than 5 faces, it should be around 65 clusters. A slightly modified Python implementation from the FaceNet repo gives around 350 clusters without any filtering, and around 30 after, but takes about 40-50 minutes to run.
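The cluster-size filtering step might be sketched like this (the labels array is a toy stand-in for Chinese whispers output; -1 marks discarded faces):

```python
import numpy as np
from collections import Counter

labels = np.array([0]*6 + [1]*2 + [2]*7 + [3]*1)  # toy cluster assignments
counts = Counter(labels.tolist())
keep = {c for c, n in counts.items() if n >= 5}    # drop clusters with < 5 faces
filtered = np.where(np.isin(labels, list(keep)), labels, -1)
```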
I have tested both glintr100 and w600k_r50 embeddings; they perform almost identically, which is great since w600k_r50 is an almost twice as fast recognition model.
BTW, while looking through images in largest cluster, I have noticed this image:
And I just can't stop thinking about it, what the hell have happened there? )))
@SthPhoenix
Link is broken
For some reason Dropbox sent me an email saying that they would take the link down. Good to hear it is now working!
Thanks so much for taking the time to look into this!
Ahahahaha, I'm unsure where the images are from; I just collected them by running RetinaFace over hours of CCTV footage.
What do you mean by "If you filter your dataset by embedding norm > 20"?
The clusters produced by FaceNet appear kind of noisy; I was trying to improve on those. Here is a UMAP projection of the FaceNet clusters from this dataset.
As you can see, the orange cluster in the top-right is split into 2-3 clusters, even though it is the same person. I will try projecting the Chinese whispers clusters as well and see what I get.
What do you mean by "If you filter your dataset by embedding norm > 20"?
You can normalize the embedding the following way:

```python
import numpy as np

embedding_norm = np.linalg.norm(embedding)
normed_embedding = embedding / embedding_norm
```
Embedding norm can be used as an additional face quality metric, since it's lower for crops with fewer meaningful features.
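In code, the norm filter described above might look roughly like this (the random embeddings are placeholders for real model outputs; the > 20 threshold comes from the comment above):

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(0, 1.5, (100, 512))  # stand-in for raw model outputs
norms = np.linalg.norm(embeddings, axis=1)
kept = embeddings[norms > 20]                # keep only higher-quality faces
# L2-normalize the surviving embeddings for clustering
normed = kept / np.linalg.norm(kept, axis=1, keepdims=True)
```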
@SthPhoenix Sorry for abandoning this post for so long!
I'm a little confused about the embedding norm: are you saying to filter with embedding_norm > 20, and then use normed_embedding for clustering? The issue with that is that I need to build a classifier on the clustered data, so I would need to store the embedding_norm and apply it to any input features (which might not be a real issue).
You can just store normed_embedding and use it for classifying instead of the original.
But then how would you classify newly incoming points? Would you also have to use np.linalg.norm on the new point before passing it into a trained classifier?
Exactly! Though it's not as scary as you may think; it will take some extra time of course, but it's negligibly low.
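A sketch of that inference-time step, assuming a simple k-NN classifier trained on the normed embeddings (the classifier choice and the synthetic data are illustrative only):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Three fake identities, 20 training embeddings each
centers = rng.normal(size=(3, 512))
train = np.vstack([c + rng.normal(0, 0.05, (20, 512)) for c in centers])
y = np.repeat([0, 1, 2], 20)
train_normed = train / np.linalg.norm(train, axis=1, keepdims=True)
clf = KNeighborsClassifier(n_neighbors=3).fit(train_normed, y)

# New incoming point: apply the same np.linalg.norm step before classifying
new_point = centers[1] + rng.normal(0, 0.05, 512)
new_normed = new_point / np.linalg.norm(new_point)
pred = clf.predict(new_normed.reshape(1, -1))[0]
```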
I think I'm having some seemingly random issues with clustering insightface features. As you can see in the image below, sometimes the clustering is very good:
But other times the clustering is very wrong:
My initial hypothesis was that there is an outlier in the feature vector space (*.npz here if you're interested) that is causing the normalization to skew all of the features, but I'm not sure that's even a possibility (is that how normalization works?)
EDIT: Never mind, the issue persists even when I don't normalize...
Normalization should not influence clustering. Have you tried manually inspecting those clusters? Possibly in such cases the input data is of lower quality.
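For what it's worth, this is easy to check numerically when clustering with cosine distance: L2-normalizing cannot change pairwise cosine distances, since cosine distance is scale-invariant:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=512), rng.normal(size=512)

def cosine_dist(u, v):
    # 1 - cosine similarity
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

d_raw = cosine_dist(a, b)
d_normed = cosine_dist(a / np.linalg.norm(a), b / np.linalg.norm(b))
# d_raw and d_normed are identical up to floating-point error
```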
Hi @atoaster, have you fixed this problem? And the above-mentioned "the clusters produced by facenet" - does it refer to the Chinese whispers algorithm? Thank you~
Hi @Jar7
Basically, I resolved the problem by using regular normalization and HDBSCAN for clustering. I also removed my custom 256-D model and instead just used the default glint360k 512-D model provided by insightface. It will occasionally still have issues, but all in all it is definitely far more accurate than what I had been trying above!