Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
Hello! I have a text mining use case with one overarching document set, consisting of many smaller subsets of documents. I want to train a topic model for each smaller subset...
I'm using the top2vec library for topic modeling on a large corpus of text data. I've successfully generated topics using the library, but when I look at the keywords for...
Hi folks, I have a huge dataset in Portuguese, with 30,000 "subsets" like the one below, where each subset is a row, but Top2Vec created only 7 topics for the whole dataset. What's...
I've recently been using PaCMAP instead of UMAP to perform dimensionality reduction. The authors of [Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMap, and...
I noticed that if I call `def save(self, file):` from the `Top2Vec` class, the embedding model does not get saved. This is because of the following lines: ``` # do...
How could I perform topic reduction on topic vectors to hierarchically group similar topics and reduce the number of topics discovered?
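One common way to approach this is to repeatedly merge the smallest topic into its most similar neighbour until the desired number of topics remains. The sketch below illustrates that idea in plain Python; it is not the library's own implementation. The function name `reduce_topics`, its inputs (a list of topic vectors and a list of topic sizes), and the size-weighted averaging of merged vectors are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def reduce_topics(vectors, sizes, num_topics):
    """Hypothetical helper: merge topics until only `num_topics` remain.

    Returns (groups, sizes), where groups[i] lists the original topic
    indices that ended up in reduced topic i.
    """
    if not 1 <= num_topics <= len(vectors):
        raise ValueError("num_topics must be between 1 and len(vectors)")

    # Work on copies so the caller's data is left untouched.
    vectors = [list(v) for v in vectors]
    sizes = list(sizes)
    groups = [[i] for i in range(len(vectors))]

    while len(vectors) > num_topics:
        # Pick the smallest topic ...
        s = min(range(len(sizes)), key=sizes.__getitem__)
        # ... and the remaining topic whose vector is most similar to it.
        t = max(
            (j for j in range(len(vectors)) if j != s),
            key=lambda j: cosine(vectors[s], vectors[j]),
        )
        # Assumption: the merged vector is the size-weighted average.
        total = sizes[s] + sizes[t]
        vectors[t] = [
            (sizes[s] * a + sizes[t] * b) / total
            for a, b in zip(vectors[s], vectors[t])
        ]
        sizes[t] = total
        groups[t].extend(groups[s])
        # Drop the absorbed topic from all three parallel lists.
        for seq in (vectors, sizes, groups):
            del seq[s]

    return groups, sizes
```

Calling `reduce_topics(topic_vectors, topic_sizes, num_topics=20)` returns a mapping from each reduced topic to the original topics it absorbed, which is the hierarchy the question asks about. To my knowledge the library also ships a built-in `hierarchical_topic_reduction(num_topics)` method on the model; its documentation is the place to confirm the exact behaviour.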
Recent versions of scikit-learn have replaced the get_feature_names() method of their vectorizers with get_feature_names_out(). This change fixes issue #307.
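For code that has to run against both older and newer scikit-learn releases, a small compatibility shim is one option. The helper name `feature_names` below is hypothetical and is not part of Top2Vec or scikit-learn; it only checks which of the two methods the fitted vectorizer provides.

```python
def feature_names(vectorizer):
    """Return the vocabulary terms of a fitted vectorizer as a list.

    Prefers get_feature_names_out() (newer scikit-learn) and falls back
    to get_feature_names() when only the older method is available.
    """
    if hasattr(vectorizer, "get_feature_names_out"):
        return list(vectorizer.get_feature_names_out())
    return list(vectorizer.get_feature_names())
```

With a fitted `CountVectorizer` or `TfidfVectorizer`, `feature_names(vectorizer)` gives the same list of terms regardless of which method the installed version exposes.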
We have a corpus of documents in German, and the task is to provide a list of the top 10 similar documents, given a specific document ID. At the moment,...
First of all, great package! It is awesome to use! I was wondering if it is possible to access individual vectors on different levels of the model. For example, if...