kmeans_pytorch
Error trying to cluster from numpy
Hi, I'm not really using pytorch, but I want to use balanced kmeans. My code is as follows:
from torch import from_numpy
from balanced_kmeans import kmeans_equal
...
# load X, a 23000x59 ndarray
n_cluster = 50
X_tensor = from_numpy(X)
choices, centers = kmeans_equal(X_tensor,
                                num_clusters=n_cluster,
                                cluster_size=X.shape[0] // n_cluster)
I get the following error:
RuntimeError: expand(torch.LongTensor{[59]}, size=[]): the number of sizes provided (0) must be greater or equal to the number of dimensions in the tensor (1)
Am I doing something wrong when creating my tensor from numpy? I apologize, since this is more of a general PyTorch question than one specific to kmeans_pytorch (and tbh I'm a total PyTorch newb!). Is there an example anywhere of using kmeans_equal on numpy data? I bet other people would find that useful. Thanks in advance for any tips you can provide!
I got a little farther by adding a batch dimension to my data since that seems to be expected by kmeans_equal. So now I use:
X_tensor = torch.reshape(torch.from_numpy(X), (1, X.shape[0], X.shape[1]))
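For anyone following along, here is a minimal sketch of that reshape on stand-in data, showing that `unsqueeze(0)` gives the same batched tensor. The random array is only a placeholder for the real 23000x59 data, and the final `.float()` cast is an assumption: `from_numpy` keeps NumPy's float64 dtype, and whether kmeans_equal needs float32 is not confirmed anywhere in this thread.

```python
import numpy as np
import torch

# Placeholder for the real data: a 23000x59 float64 ndarray.
X = np.random.rand(23000, 59)

# from_numpy shares memory with X and keeps its dtype (float64 here).
X_tensor = torch.from_numpy(X)

# Add a leading batch dimension: (23000, 59) -> (1, 23000, 59).
X_batched = X_tensor.unsqueeze(0)

# The explicit reshape used above produces the same tensor.
X_reshaped = torch.reshape(X_tensor, (1, X.shape[0], X.shape[1]))

# Assumption: cast to float32 in case the clustering code expects it.
X_float = X_batched.float()

print(X_batched.shape)  # torch.Size([1, 23000, 59])
print(X_batched.dtype)  # torch.float64
print(X_float.dtype)    # torch.float32
```

Printing the shape and dtype right before the clustering call is a cheap way to check what is actually being passed in.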
But now I get this error:
line 165, in kmeans_equal
selected_ind = torch.argsort(cluster_positions, dim=-1)[:, :cluster_size]
IndexError: too many indices for tensor of dimension 1
Process finished with exit code 1
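The IndexError itself can be reproduced outside the library, which may help narrow things down. The sketch below only demonstrates the indexing behaviour: slicing with `[:, :cluster_size]` needs a tensor with at least two dimensions, so the message suggests `cluster_positions` is 1-D at that point. The tensor shapes and names here are made up for illustration, and why the tensor ends up 1-D inside kmeans_equal is not something this example establishes.

```python
import torch

cluster_size = 3

# 2-D case: argsort along the last dim, then keep the first
# cluster_size indices of every row. This works.
positions_2d = torch.rand(4, 10)
selected = torch.argsort(positions_2d, dim=-1)[:, :cluster_size]
print(selected.shape)  # torch.Size([4, 3])

# 1-D case: the same indexing expression has one index too many.
positions_1d = torch.rand(10)
message = None
try:
    torch.argsort(positions_1d, dim=-1)[:, :cluster_size]
except IndexError as err:
    message = str(err)
    print(message)
```

If the same message appears here, checking the shapes of the intermediate tensors (for example in a debugger at line 165) would show where a dimension is being dropped.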