scikit-dimension
Recommendations on algorithm choice
Hi,
Thank you for a great package! I'm new to intrinsic dimension (ID) estimation, so this is a naive question that probably deserves a complex answer: could you recommend which algorithms to try first for which use cases? For example, if I plan to run UMAP + HDBSCAN clustering afterwards (a rather common choice nowadays, I believe) and my dataset has 5000 samples with 1000 features each, which estimator would be a sensible starting point? I'm happy to read around, but I found it hard to find simple recommendations for practitioners. I believe any such guidance would greatly increase the value of the package, even if the recommendations are less than perfect.
Cheers, Eike