kmeans_pytorch

center_shift=nan

Open eghouti opened this issue 5 years ago • 13 comments

Hello,

Sometimes I get center_shift=nan, and I don't understand why or how to fix it.

eghouti Feb 11 '20 22:02

center_shift can be a very large number when the centroids change a lot (in the initial iterations of the K-means algorithm). I am not sure why it would be nan though. Is it possible for you to reproduce the case when center_shift=nan?

subhadarship Feb 11 '20 22:02

yes I obtained it several times

eghouti Feb 11 '20 22:02

It would be very useful if you can share the code snippet which you are using so that I can reproduce it at my end.

subhadarship Feb 11 '20 22:02

I just took resnet18 and ran k-means on its weights (independently for each layer).

eghouti Feb 11 '20 22:02

So if I understand correctly, you are trying to cluster the weight vectors from different layers. Can you confirm?

clustering the weights of resnet sounds really cool btw !!

subhadarship Feb 11 '20 23:02

yes this is exactly what I am trying to do

eghouti Feb 11 '20 23:02

@eghouti what distance metric are you using? euclidean or cosine?

subhadarship Feb 15 '20 00:02

I used euclidean distance

eghouti Feb 18 '20 01:02

Same problem here, reproduced every time. Code: https://github.com/NIRVANALAN/Centroid_GCN

python train.py --dataset cora --self-loop

It clusters logits produced by dgl (a graph ML library, easy to install).

NIRVANALAN Feb 21 '20 12:02

A toy problem fails too. Code as follows:

import torch
import numpy as np
from kmeans_pytorch import kmeans

# data
data_size, dims, num_clusters = 1000, 1, 3
x = np.random.randn(data_size, dims) 
x[x<0.] = 0.
x = torch.from_numpy(x)

# kmeans
cluster_ids_x, cluster_centers = kmeans(
    X=x, num_clusters=num_clusters, distance='euclidean', device=torch.device('cuda:0')
)

HolmesShuan Apr 07 '20 02:04

The problem arises when a cluster center is an outlier: no points are assigned to it, so there is nothing to average when the center is updated. Taking the mean of zero points results in NaN.
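A minimal sketch of that failure mode in plain PyTorch, independent of the library. The tensors `points` and `centers` are made-up illustration data, with the last center deliberately placed far from every point:

```python
import torch

# Hypothetical toy data: four 1-D points and three centers.
# The last center is an outlier, far away from every point.
points = torch.tensor([[0.0], [0.1], [1.0], [1.1]])
centers = torch.tensor([[0.0], [1.0], [100.0]])

# Euclidean distance from every point to every center, shape (4, 3).
distances = (points[:, None, :] - centers[None, :, :]).norm(dim=2)

# Assign each point to its nearest center.
assignments = distances.argmin(dim=1)

# No point is nearest to center 2, so `selected` has zero rows.
selected = points[assignments == 2]

# The mean over zero rows is NaN, which then propagates into center_shift.
empty_mean = selected.mean(dim=0)
print(selected.shape, empty_mean)
```

Once one center is NaN, every distance computed against it is NaN as well, so the value never recovers in later iterations.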

For line 80 in __init__.py:

if torch.isnan(selected.mean(dim=0)).sum()==0:
    initial_state[index] = selected.mean(dim=0)
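For context, here is a standalone sketch of a center update with the same kind of guard. It is not the library's actual loop: `update_centers`, `points`, and `assignments` are hypothetical names, the guard checks for an empty selection rather than for NaN, and `center_shift` is computed as the sum of per-center Euclidean shifts, which is an assumption about how the library defines it.

```python
import torch


def update_centers(points, assignments, centers):
    """Recompute each center as the mean of its assigned points.

    Hypothetical helper: a cluster with no assigned points keeps its
    previous center instead of turning into NaN.
    """
    new_centers = centers.clone()
    for index in range(centers.shape[0]):
        selected = points[assignments == index]
        if selected.shape[0] > 0:  # leave empty clusters untouched
            new_centers[index] = selected.mean(dim=0)
    return new_centers


# Made-up illustration data: cluster 2 receives no points.
points = torch.tensor([[0.0], [0.2], [1.0], [1.2]])
centers = torch.tensor([[0.0], [1.0], [100.0]])
assignments = torch.tensor([0, 0, 1, 1])

new_centers = update_centers(points, assignments, centers)

# Assumed definition: sum over centers of the Euclidean distance moved.
center_shift = torch.sum(torch.sqrt(torch.sum((new_centers - centers) ** 2, dim=1)))
print(new_centers, center_shift)
```

Keeping the old center is the simplest choice. Re-seeding an empty cluster with a randomly chosen data point is another option, and which one works better likely depends on the data.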

GenjiB Apr 22 '20 05:04

In my case, initializing cluster_centers produces that error every time, and @GenjiB's method does fix it.

YNNEKUW May 03 '20 19:05

Was hitting this constantly trying to switch over from a working sklearn kmeans use case. Thanks for the workaround.

markrmiller Jan 29 '24 19:01