RR28
I am also using the cuML version of BERTopic and, sadly, I am not getting probabilities. Is there any way to extract more than 30 words per topic? As I have to...
> > Is there any way to extract more than 30 words on the topic? > > In the latest version of BERTopic, you can access the c-TF-IDF matrix with...
Hi MaartenGr, I apologize for asking so many questions. I have another one: I want to merge topics from two different datasets using the code below from Tips...
> @rubypnchl Merging topics from two different models is currently not possible. If you follow along with the description of [BERTopic's algorithm](https://maartengr.github.io/BERTopic/algorithm/algorithm.html) this becomes quickly clear. Namely, we would have...
> Could you share your code for training BERTopic? The code would help me identify where the issue might stem from. The parameter you refer to was not changed between...
> When you use `min_topic_size` you are essentially setting the `min_cluster_size` parameter in HDBSCAN. So if you are using a custom HDBSCAN model, `min_topic_size` is not used and replaced by...
Thank you for the quick response! > > Thank you for the great work. I am currently using BERTopic for one of my problems. I am facing issues while updating the topics...
> > new_topics = [np.argmax(prob) if max(prob) >= probability_threshold else -1 for prob in probs] > > topic_model.update_topics(abstracts, new_topics, vectorizer_model=vectorizer_model) > > It might indeed be the case here that...
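The thresholding step in the quoted snippet can be tried in isolation, without a fitted model. The sketch below is a pure-Python equivalent of the `np.argmax` one-liner. The helper name `assign_topics` and the sample probability rows are made up for illustration; `probs` is assumed to hold one row of per-topic probabilities per document, as in the quoted code.

```python
def assign_topics(probs, probability_threshold):
    """Return one topic id per document, or -1 (outlier) when the
    highest topic probability is below the threshold."""
    new_topics = []
    for prob in probs:
        # Index of the largest probability; ties resolve to the first index,
        # which matches np.argmax.
        best = max(range(len(prob)), key=lambda i: prob[i])
        new_topics.append(best if prob[best] >= probability_threshold else -1)
    return new_topics


# Hypothetical document-topic probabilities for three documents.
probs = [
    [0.10, 0.70, 0.20],  # confident: topic 1
    [0.30, 0.35, 0.35],  # no topic reaches the threshold: outlier
    [0.05, 0.05, 0.90],  # confident: topic 2
]

new_topics = assign_topics(probs, probability_threshold=0.5)
print(new_topics)  # [1, -1, 2]
```

Raising `probability_threshold` sends more documents to `-1`, and lowering it sends fewer. The resulting list is what the quoted snippet then passes to `topic_model.update_topics(...)`.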
> > My main aim is to get no outliers, or as few as possible, while keeping good-quality topics, > > If you want minimal or no outliers, then I...
> I encountered the same issue with BERTopic using a large dataset. The way BERTopic uses the vectorizer somehow results in huge memory consumption. I suspect the reason for this...