Maarten Grootendorst

Results 8 issues of Maarten Grootendorst

## **Highlights**:

* Online/incremental topic modeling with `.partial_fit`
* Expose c-TF-IDF model for customization with `bertopic.vectorizers.CTfidfTransformer`
* Several parameters were added for potentially improved representations
  * `bm25_weighting`
  * `reduce_frequent_words`
  * ...

[Faiss](https://github.com/facebookresearch/faiss) allows you to efficiently search and cluster dense vectors. This could be beneficial when comparing the cosine similarities between vectors in the TF-IDF and Embeddings model. However, since it...

enhancement
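As a rough illustration of the comparison the Faiss issue above refers to, here is a minimal sketch in plain NumPy (not Faiss itself) of a top-k cosine similarity search. The function name `top_k_cosine` and the toy vectors are made up for this example. Normalizing the rows first means a plain inner product equals cosine similarity, which is the same trick that would let an inner-product index such as `faiss.IndexFlatIP` stand in for the matrix product on larger collections.

```python
import numpy as np


def top_k_cosine(queries, corpus, k=2):
    """Return indices and cosine similarities of the k most similar corpus rows per query."""
    # L2-normalize rows so that a plain inner product equals cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = q @ c.T
    # Sort each row by descending similarity and keep the first k columns.
    idx = np.argsort(-sims, axis=1)[:, :k]
    return idx, np.take_along_axis(sims, idx, axis=1)


# Toy vectors, invented for illustration only.
corpus = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
queries = np.array([[2.0, 0.1]])
indices, scores = top_k_cosine(queries, corpus, k=2)
```

This brute-force version computes every pairwise similarity, so it is only practical for small matrices; that cost is what a dedicated vector index is meant to reduce.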

* Added function to extract and pass word/document embeddings which should make fine-tuning much faster
* Fix #71

```python
from keybert import KeyBERT

kw_model = KeyBERT()
doc_embeddings, word_embeddings = kw_model.extract_embeddings(docs)
...
```

I am on a mission to collect **real-world use cases** of BERTopic, KeyBERT, and PolyFuzz. For that, I can use your help! Sharing your use case will drive development and...

I am on a mission to collect **real-world use cases** of BERTopic, KeyBERT, and PolyFuzz. For that, I can use your help! Sharing your use case will drive development and...

In a recent version of scikit-learn, I believe it was [v1.3](https://scikit-learn.org/1.5/whats_new/v1.3.html#id8), HDBSCAN was implemented with base functionality. Considering scikit-learn is already a requirement of BERTopic, it stands to reason to...

enhancement
question