torch_kmeans
Fix Kmeans cluster updates issue
As this Stack Overflow answer suggests, the current group_by_label_mean function cannot handle clusters that have no data points assigned to them. For such clusters, entire rows of M can be 0, which leads to NaN values when F.normalize() is called; those NaNs then propagate to all centers.
Fixed by creating masks for the empty clusters. The current solution keeps each empty cluster's center at its value from before the current iteration. We could instead set those centers to 0 if that is better aligned mathematically.
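
For illustration, the masking idea can be sketched roughly as follows. This is not the code in this PR: update_centers is a hypothetical helper, it works on a single unbatched input, and it divides by explicit per-cluster counts instead of calling F.normalize(). It only shows how a mask over empty clusters lets the previous centers be kept.

```python
import torch


def update_centers(x, labels, old_centers):
    """Hypothetical helper: mean update that keeps old centers for empty clusters.

    x:           (n, d) float tensor of data points
    labels:      (n,) int64 tensor of cluster assignments in [0, k)
    old_centers: (k, d) float tensor of centers from the previous iteration
    """
    k = old_centers.shape[0]
    # One-hot assignment matrix M of shape (k, n).
    # A cluster with no assigned points yields an all-zero row.
    M = torch.nn.functional.one_hot(labels, num_classes=k).T.to(x.dtype)
    counts = M.sum(dim=1, keepdim=True)  # (k, 1) points per cluster
    empty = counts.squeeze(1) == 0       # (k,) mask of empty clusters
    # Clamp the divisor so empty clusters do not produce 0 / 0 = NaN.
    means = (M @ x) / counts.clamp(min=1)
    # Keep the previous center wherever the cluster is empty.
    new_centers = torch.where(empty.unsqueeze(1), old_centers, means)
    return new_centers, empty


x = torch.tensor([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
labels = torch.tensor([0, 0, 2])  # cluster 1 receives no points
old_centers = torch.tensor([[1.0, 1.0], [5.0, 5.0], [9.0, 9.0]])
new_centers, empty = update_centers(x, labels, old_centers)
```

To set empty clusters to 0 instead, old_centers in the torch.where call could be replaced with torch.zeros_like(means).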