hdbscan
Impact of data normalization on the HDBSCAN algorithm?
When HDBSCAN builds the minimum spanning tree, how does it calculate the core distance? Does it just use the raw numerical values? When two-dimensional data (two columns) have very different value ranges, for example col1 ranges from 0 to 1 and col2 ranges from 10000 to 20000, does this mean col1 would contribute very little to the core distance calculation?
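To make the concern concrete, here is a minimal sketch in plain Python. It assumes Euclidean distance (the default metric in the hdbscan library) and takes the core distance of a point to be the distance to its k-th nearest neighbour, excluding the point itself. The helpers `core_distances` and `min_max_scale` are hypothetical names written for this illustration, not part of the hdbscan API, and the data is synthetic. The sketch compares core distances computed from both columns with core distances computed from col2 alone, before and after min-max scaling.

```python
import math
import random


def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbour (self excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def min_max_scale(points):
    """Rescale every column to the range [0, 1]."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(p, lows, spans)) for p in points]


def max_relative_gap(full, partial):
    """Largest relative difference between two lists of core distances."""
    return max(abs(a - b) / a for a, b in zip(full, partial))


# Synthetic data: col1 in [0, 1], col2 in [10000, 20000] (fixed seed).
rng = random.Random(0)
points = [(rng.uniform(0, 1), rng.uniform(10000, 20000)) for _ in range(200)]
k = 5  # plays the role of min_samples in this sketch

# Raw data: compare core distances from both columns with col2 alone.
raw_full = core_distances(points, k)
raw_col2 = core_distances([(p[1],) for p in points], k)
max_rel_gap_raw = max_relative_gap(raw_full, raw_col2)

# Scaled data: the same comparison after min-max scaling each column.
scaled = min_max_scale(points)
scaled_full = core_distances(scaled, k)
scaled_col2 = core_distances([(p[1],) for p in scaled], k)
max_rel_gap_scaled = max_relative_gap(scaled_full, scaled_col2)

print("max relative gap, raw data:   ", max_rel_gap_raw)
print("max relative gap, scaled data:", max_rel_gap_scaled)
```

On the raw data, dropping col1 should barely change the core distances, because any col1 difference is at most 1 while col2 differences are typically in the tens or hundreds. After scaling, dropping col1 changes them substantially, which suggests that under these assumptions col1 only matters once the columns are brought to comparable ranges.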