Top2Vec issues

Using a pre-trained doc2vec model

1

Hello! I have a text mining use-case with one overarching document set, consisting of many smaller sub-sets of documents. I want to train a topic model for each smaller sub-set...

SjoerdBraaksma

How to get all keywords for each topic in top2vec when the model shows the first topic has a length of 180 but only gives the first 50 words?

1

I'm using the top2vec library for topic modeling on a large corpus of text data. I've successfully generated topics using the library, but when I look at the keywords for...

davood-hadiannejad

So few topics in a huge dataset

1

Hi folks, I have a huge dataset in Portuguese with 30,000 "subsets" like the one below, where each subset is a row, but top2vec created only 7 topics for the whole dataset. What's...

GranamyrBR

Ability to use PaCMAP for Dimensionality reduction

1

I've recently been using PaCMAP instead of UMAP to perform dimensionality reduction. The authors of [Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMap, and...

helenkimber

Possibility to save into Top2Vec model the specified embedding model (i.e. universal-sentence-encoder-large)

1

I noticed that if I call `def save(self, file):` from the `Top2Vec` class, the embedding model does not get saved. This is because of the following lines: ``` # do...

WilliamBonvini

How to perform topic reduction?

1

How could I perform topic reduction on topic vectors to hierarchically group similar topics and reduce the number of topics discovered?

AlbertoDeBenedittis
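
For context on the question above: Top2Vec exposes a `hierarchical_topic_reduction(num_topics)` method for this. The snippet below is only a simplified, library-free sketch of the general idea (repeatedly fold the smallest topic into its most similar neighbour); it is not the library's implementation. The function names `cosine` and `reduce_topics` and the size-weighted averaging rule are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def reduce_topics(vectors, sizes, num_topics):
    """Greedily merge the smallest topic into its most similar topic.

    vectors: list of topic vectors (lists of floats)
    sizes:   positive document counts, one per topic
    Returns (vectors, sizes, groups); groups[i] lists the original
    topic indices that were folded into reduced topic i.
    """
    if num_topics < 1:
        raise ValueError("num_topics must be at least 1")
    vectors = [list(v) for v in vectors]
    sizes = list(sizes)
    groups = [[i] for i in range(len(vectors))]

    while len(vectors) > num_topics:
        # Pick the topic with the fewest documents.
        s = min(range(len(sizes)), key=sizes.__getitem__)
        # Find the most similar remaining topic (hypothetical merge rule).
        others = [i for i in range(len(vectors)) if i != s]
        t = max(others, key=lambda i: cosine(vectors[s], vectors[i]))
        # Replace the target vector with a size-weighted average.
        total = sizes[s] + sizes[t]
        vectors[t] = [
            (sizes[s] * a + sizes[t] * b) / total
            for a, b in zip(vectors[s], vectors[t])
        ]
        sizes[t] = total
        groups[t].extend(groups[s])
        # Drop the merged topic from all three parallel lists.
        for seq in (vectors, sizes, groups):
            del seq[s]

    return vectors, sizes, groups
```

Calling `reduce_topics(topic_vectors, topic_sizes, 20)` on hypothetical inputs would return at most 20 merged topics, along with the original topic indices each one absorbed.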

update get_feature_names() to get_feature_names_out()

2

In recent versions, scikit-learn renamed the get_feature_names() method of its vectorizers to get_feature_names_out(). This change fixes issue #307.

j6e
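
As a rough illustration of the rename discussed above, code that must run against both older and newer scikit-learn releases can look the method up at runtime. This is a minimal sketch, not the patch from the pull request; the helper name `feature_names` is hypothetical.

```python
def feature_names(vectorizer):
    """Return the vocabulary terms of a fitted scikit-learn-style vectorizer.

    Prefers get_feature_names_out() and falls back to the older
    get_feature_names() when the newer method is missing.
    """
    getter = getattr(vectorizer, "get_feature_names_out", None)
    if getter is None:
        # Older releases only provide get_feature_names().
        getter = vectorizer.get_feature_names
    # Newer releases return an array; normalise to a plain list.
    return list(getter())
```

With a fitted `CountVectorizer` named `vectorizer`, `feature_names(vectorizer)` would return the vocabulary as a list of strings on either API.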

fix: dict object is not callable

Searching for similar documents, given a German-language text corpus

access topic/document/etc. vectors
