Maarten Grootendorst

931 comments by Maarten Grootendorst

This is currently not implemented in BERTopic as the topics are not directly generated through word embeddings on which you can perform operations. Since BERTopic focuses on topic-word distributions it...

From your dataframe, it seems that you have assigned keywords to the wrong topic by indexing on the previous index. Take the image below: ![image](https://user-images.githubusercontent.com/25746895/171720982-020a25a1-a1f6-48a6-ac43-32b440126d36.png) Here, you can see that...

Without looking at the documents, my guess would be that the documents in topic 2 are mostly empty or very short such that the resulting topic representations are empty.
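
One way to test this guess is to measure, per topic, how many documents are empty or very short. The following is a minimal sketch, not part of BERTopic: the helper `short_document_share` and the `min_tokens` threshold are hypothetical, and it assumes `topics` is the per-document topic list (for example, the first value returned by `fit_transform`), aligned with `docs`.

```python
from collections import defaultdict


def short_document_share(docs, topics, min_tokens=3):
    """Return, per topic, the fraction of documents with fewer than
    `min_tokens` whitespace-separated tokens (hypothetical helper)."""
    totals = defaultdict(int)
    short = defaultdict(int)
    for doc, topic in zip(docs, topics):
        totals[topic] += 1
        if len(doc.split()) < min_tokens:
            short[topic] += 1
    return {topic: short[topic] / totals[topic] for topic in totals}


# Toy data: topic 2 holds only empty or one-word documents.
docs = [
    "",
    "ok",
    "a longer document about topic modeling",
    "another reasonably long text here",
]
topics = [2, 2, 0, 0]

shares = short_document_share(docs, topics)
print(shares)
```

A topic whose share is close to 1.0 consists almost entirely of short documents, which would be consistent with an empty topic representation.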

> It seems that if a model is created without specifying nr_topics then BERTtopic.hdbscan_model.labels_ will return the initial assignments.

When you do not specify `nr_topics`, the topics in `BERTopic.hdbscan_model.labels_` will...

Without a bit more context, it is difficult to say what exactly is happening here. How many documents are you using for fitting and how many predicting? How many topics...

The `SentenceTransformer` model should automatically select the GPU if it can find one. To check whether a correct CUDA-enabled GPU can be found in your environment, it would be helpful...

Could you also share the output from the other lines of code (e.g., `torch.cuda.is_available`, `torch.cuda.current_device`, etc)? It might be worthwhile to check the performance of `sentence-transformers`. To do so, please...
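
The CUDA checks mentioned above can be gathered into one small report. This is a minimal sketch rather than the exact snippet from the original thread: the helper name `describe_torch_device` is hypothetical, and the code falls back to reporting `cpu` when PyTorch is not installed.

```python
import importlib.util


def describe_torch_device():
    """Report whether PyTorch is installed and can see a CUDA device
    (hypothetical helper for illustration)."""
    info = {"torch_installed": False, "cuda_available": False, "device": "cpu"}

    # Only import torch if it is actually installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["torch_installed"] = True
    info["cuda_available"] = bool(torch.cuda.is_available())
    if info["cuda_available"]:
        index = torch.cuda.current_device()
        info["device"] = f"cuda:{index}"
        info["device_name"] = torch.cuda.get_device_name(index)
    return info


report = describe_torch_device()
print(report)
```

If `cuda_available` is `False` even though a GPU is present, the installed PyTorch build may be CPU-only or may not match the installed CUDA driver.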

I am not entirely sure what is happening but if it finds the device, it should be using it automatically seeing as CUDA is properly installed. Instead of using `fetch_20newsgroups`...

Thank you for the suggestion! I am not very familiar with document-level covariates as used in STM. However, reading through the [STM documentation](https://cran.r-project.org/web/packages/stm/vignettes/stmVignette.pdf) there seems to be some overlap...

Thank you for sharing the paper, will make sure to read it through! It does seem like it would definitely be an interesting and useful extension of sorts to BERTopic....