Recommendations on algorithm choice

Open e-pet opened this issue 6 months ago • 0 comments

Hi,

thank you for a great package! I'm new to intrinsic dimension (ID) estimation, so this is a naive question that probably deserves a complex answer: could you provide recommendations on which algorithms to try first for which use cases? For example, if I plan to run UMAP + HDBSCAN clustering afterwards (a rather common choice nowadays, I believe) and my dataset has 5000 samples with 1000 features each, is there a recommended starting point? I'm happy to read up on the topic; I just found it hard to find simple recommendations for practitioners. I believe any such guidance could greatly increase the value of the package, even if the recommendations are less than perfect.

Cheers, Eike

e-pet, Aug 12 '24 19:08