Faiss KMeans Inertia

Open mertozlutiras-hyrd opened this issue 3 years ago • 3 comments

I'm trying to see the inertia values of my clustering (sum of squared errors). I've seen several people using:

faiss.Kmeans.obj[-1] as a measure of inertia value for KMeans.

However, for me this value always increases with a higher number of clusters, which is unexpected (it should decrease).

This '.obj' attribute is defined as 'iteration_stats' in the _swigfaiss_avx2 file as follows:

iteration_stats = property(_swigfaiss_avx2.ProgressiveDimClustering_iteration_stats_get,_swigfaiss_avx2.ProgressiveDimClustering_iteration_stats_set, doc=r""" stats at every iteration of clustering""")

What does 'stats at every iteration of clustering' stand for here? Is it correct to use the last element of this array as the inertia value or should it be accessed by another variable?

mertozlutiras-hyrd avatar Oct 05 '22 11:10 mertozlutiras-hyrd

see stats here https://github.com/facebookresearch/faiss/blob/main/faiss/Clustering.h#L43

mdouze avatar Oct 06 '22 12:10 mdouze

Thanks for the answer. It seems that obj[-1] should correspond to the SSE (or inertia).

Then do you have any idea why it does not decrease with a higher number of clusters? I've seen this reported by many others. @mdouze

ghost avatar Oct 06 '22 12:10 ghost

@mdouze Could you tell me if this is expected behavior?

ghost avatar Oct 11 '22 10:10 ghost

@mdouze If I understand correctly, obj is the sum of squared errors over all samples. But if I pass the weights parameter to Kmeans.train, the objective matches neither the weighted value (scikit-learn inertia) nor the un-weighted value (which is what the faiss source seems to compute), which is confusing.

Weighted: 7.508362767330208
Un-weighted: 7.685895746353934
obj.min() == obj[-1]: 7.669065475463867

jjyyxx avatar Jul 15 '23 07:07 jjyyxx
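
For reference, the two quantities compared above can be written down explicitly. This is a minimal NumPy sketch, not faiss code: the helper name inertia and the toy data are made up for illustration, and it assumes the weighted inertia is the per-sample weight times the squared distance to the nearest centroid, summed over samples.

```python
import numpy as np

def inertia(x, centroids, weights=None):
    """Sum of squared L2 distances from each point to its nearest centroid.

    Hypothetical helper for illustration; `weights` (optional) scales each
    point's squared error before summing.
    """
    x = np.asarray(x, dtype=np.float64)
    centroids = np.asarray(centroids, dtype=np.float64)
    # Pairwise squared distances, shape (n_points, n_centroids).
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.min(axis=1)
    if weights is None:
        return float(nearest.sum())
    return float((np.asarray(weights, dtype=np.float64) * nearest).sum())

# Toy data: two points near the first centroid, one exactly on the second.
x = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0]])
centroids = np.array([[0.0, 1.0], [10.0, 0.0]])
weights = np.array([1.0, 3.0, 2.0])

unweighted = inertia(x, centroids)          # 1 + 1 + 0
weighted = inertia(x, centroids, weights)   # 1*1 + 3*1 + 2*0
```

Comparing both values against obj[-1] on the same points used for training is one way to check which definition the reported objective is closest to.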

Running into the same issue here. obj[-1] increases as the number of clusters increases.

What am I getting wrong? Is there a way to get inertia from the kmeans object?

ckolluru avatar May 07 '24 22:05 ckolluru

+1

srggrs avatar May 24 '24 00:05 srggrs

This solved my issue. It was related to min and max points per centroid.

https://github.com/facebookresearch/faiss/issues/1887#issue-892534946

ckolluru avatar May 24 '24 02:05 ckolluru
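
One plausible reading of the fix above, sketched under stated assumptions: faiss clustering has a max_points_per_centroid parameter (default 256), and when the training set is larger than k * max_points_per_centroid it is subsampled to that size. If the objective is summed over the subsample, it covers more points as k grows, so it can increase even while the per-point error falls. The helper points_used below is hypothetical and only illustrates the arithmetic.

```python
def points_used(n, k, max_points_per_centroid=256):
    """Training points kept if the set is capped at k * max_points_per_centroid.

    Hypothetical helper; 256 is the assumed faiss default for the cap.
    """
    return min(n, k * max_points_per_centroid)

n = 1_000_000
used = {k: points_used(n, k) for k in (10, 100, 1000)}
# With n = 1,000,000 the cap applies for all three values of k,
# so the number of points contributing to the sum grows 10x each step.

# Assumed faiss usage (not executed here) to get inertia on the full data:
#   kmeans = faiss.Kmeans(d, k, niter=20)
#   kmeans.train(x)
#   D, _ = kmeans.index.search(x, 1)   # squared L2 distance to nearest centroid
#   full_inertia = float(D.sum())
```

Under the same assumptions, raising max_points_per_centroid so that no subsampling occurs, or computing the sum over the full dataset as in the commented lines, should give values that are comparable across different numbers of clusters.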