Robert E. Hatem
Thanks for implementing this feature! Has anyone had issues using `all_points_membership_vectors` with a large dataset? (for me, original embeddings are 235002 x 384). It causes my Python kernel to fail....
g4dn.8xlarge   
> @hatemr I tried BERTopic with GPU for almost 1M documents. My original embeddings are 1M x 384. I tried to get the probabilities of each topic for every...
```
HDBSCAN(min_cluster_size=20,
        metric='euclidean',
        cluster_selection_method='eom',
        prediction_data=True)
```

Thanks, I'll try adjusting `min_samples`
> ```
> HDBSCAN(min_cluster_size=20,
>         metric='euclidean',
>         cluster_selection_method='eom',
>         prediction_data=True)
> ```
>
> Thanks, I'll try adjusting `min_samples`

Increasing `min_samples` worked! I had to increase `min_samples` itself - increasing...
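For anyone hitting the same kernel crash, here is a minimal sketch of the workaround described above: set `min_samples` explicitly (and keep `prediction_data=True`) before calling `all_points_membership_vectors`. The array size and the value `min_samples=50` are illustrative assumptions, not values confirmed in this thread; tune them for your own data.

```python
import numpy as np
import hdbscan

# Stand-in embeddings; in the thread the real matrix was 235002 x 384.
embeddings = np.random.rand(5_000, 384).astype(np.float32)

clusterer = hdbscan.HDBSCAN(
    min_cluster_size=20,
    min_samples=50,                   # illustrative value; raising this is the fix reported above
    metric='euclidean',
    cluster_selection_method='eom',
    prediction_data=True,             # required for all_points_membership_vectors
)
clusterer.fit(embeddings)

# Soft-membership matrix of shape (n_documents, n_clusters); this is the
# call that was exhausting memory with the default (lower) min_samples.
membership = hdbscan.all_points_membership_vectors(clusterer)
print(membership.shape)
```

If you are going through BERTopic rather than calling hdbscan directly, the same pre-configured clusterer can be passed in via the `hdbscan_model` argument when constructing the topic model.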