Strange clustering

Open MichaelMonashev opened this issue 3 years ago • 1 comments

Code:

import hdbscan
import numpy as np

x = np.array([
    [0.36789608, 0.17779213, 0.83797550, 0.77753013],
    [0.36628222, 0.17353597, 0.83745314, 0.78465497],
    [0.37088317, 0.17572623, 0.84084779, 0.78386849],
    [0.36569396, 0.17433393, 0.83739746, 0.78440967],
    [0.36793751, 0.17673337, 0.84037548, 0.78139651],
    [0.36722952, 0.17239252, 0.83743829, 0.78435159],
    [0.88804066, 0.81364667, 0.99931133, 1.        ],  # outlier
    [0.36865044, 0.18000209, 0.83752632, 0.78532994],
    [0.36644703, 0.17631954, 0.83802074, 0.78327519],
])

clusterer = hdbscan.HDBSCAN(
    min_cluster_size=2,  # fixed, I cannot change it
    min_samples=1,  # fixed, I cannot change it
)
clusterer.fit(x)

print(clusterer.labels_)

Output:

[-1 -1 -1 -1 -1 -1 -1 -1 -1]

But I expected something like this:

[0 0 0 0 0 0 -1 0 0]

How do I correct this behavior?

MichaelMonashev, Sep 06 '21 17:09

Set allow_single_cluster=True?

mdruiter, Jan 25 '24 13:01

It works. Thank you.

MichaelMonashev, Mar 16 '24 07:03