somoclu

clusterID of the original samples

Open isaac-you opened this issue 5 years ago • 3 comments

SOM clustering is a good method for customer segmentation, and somoclu makes it strong enough to handle big data. Thank you so much. But after training, I can only find a cluster ID for the nodes (neurons), not for the original samples. Also, the default number of clusters for k-means is 8, so how can I set a different number? Thank you so much for your help.

isaac-you avatar Oct 17 '18 00:10 isaac-you

The best matching units array does not give me the cluster ID directly. When I run the example from https://somoclu.readthedocs.io/en/stable/example.html on the 150 random samples, the best matching units array is just a matrix of shape (150, 2) with no cluster ID; it looks more like coordinates for the 150 samples in 2-D space. So how can I find the cluster ID for the original 150 samples? Thank you.

isaac-you avatar Oct 17 '18 01:10 isaac-you

How can I find out which specific cluster each sample belongs to after clustering?

deepwindlee avatar Jul 04 '19 12:07 deepwindlee

Hi @isaac-you, you can use the best matching units, as suggested in the documentation.

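# Requires som.cluster() to have been called so that som.clusters exists.
bmus = som.get_bmus(som.get_surface_state(X))
# Assumes each BMU is (column, row) while som.clusters is indexed [row, column].
cluster_labels = [som.clusters[bmu[1], bmu[0]] for bmu in bmus]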

However, I am still wondering why there is no such method in the library itself, given that it already has clustering support.
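In the meantime, a small helper can do the lookup. The sketch below is not part of somoclu; `labels_for_samples` is a hypothetical name, and it assumes that each best matching unit is a (column, row) pair while the cluster grid is indexed as [row][column]. The toy grid is deliberately non-square so that the index order matters.

```python
def labels_for_samples(bmus, clusters):
    """Map each sample's BMU to the cluster label of that map node.

    Assumes each BMU is a (column, row) pair and `clusters` is a grid
    indexed as clusters[row][column]. Works with nested lists or
    NumPy arrays.
    """
    return [clusters[row][col] for col, row in bmus]


# Toy map with 2 rows and 3 columns of node cluster labels.
clusters = [[0, 0, 1],
            [2, 2, 1]]
# Three samples, each with its BMU given as (column, row).
bmus = [(0, 0), (2, 1), (1, 1)]

labels = labels_for_samples(bmus, clusters)
print(labels)
```

With a trained map, the same helper would be called as `labels_for_samples(som.bmus, som.clusters)` after `som.cluster()`. On the cluster count: assuming `som.cluster` takes a scikit-learn clusterer through its `algorithm` argument, as the somoclu documentation describes, passing something like `sklearn.cluster.KMeans(n_clusters=3)` should replace the default of 8.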

Sitin avatar Jan 17 '21 19:01 Sitin