Maarten Grootendorst

931 comments by Maarten Grootendorst

Ah right, totally forgot about those 😅. Yes, you can use those to calculate the distances between documents and topics but these topic embeddings are just the unweighted average of...

@kjaksic We can use the `topic_model.topic_embeddings` to get the average document embedding for each topic. Although you can use that to compare distances, I would advise going with `topic_model.c_tf_idf` instead....
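As a rough illustration of comparing topics by their term-weight rows, here is a minimal sketch in plain Python. It assumes each topic's row has already been extracted as a list of floats; the toy values and the helper name `cosine_similarity` are hypothetical and not part of the library. With a real model the rows would come from the attribute mentioned above, which may need converting to a dense array first.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Guard against all-zero rows, which would otherwise divide by zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-in for c-TF-IDF rows: one row per topic, one column per term.
# These numbers are made up for illustration only.
toy_c_tf_idf = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
]

# Rows 0 and 1 share their dominant term, so they should score higher
# than rows 0 and 2, which barely overlap.
sim_close = cosine_similarity(toy_c_tf_idf[0], toy_c_tf_idf[1])
sim_far = cosine_similarity(toy_c_tf_idf[0], toy_c_tf_idf[2])
```

A higher score means the two topics put weight on the same terms; turning that into a distance is usually done as `1 - similarity`.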

@kjaksic
> Does this mean that the 0 index in the embedding matrix refers to the topic -1, that is, unclustered comments?

Yes, the 0 index is indeed topic -1!
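The offset between row index and topic id is easy to get wrong, so here is a small sketch of the mapping. Only the fact that row 0 is topic -1 comes from the comment above; that the remaining rows follow in increasing topic order is an assumption, and the helper names `row_to_topic` and `topic_to_row` are hypothetical.

```python
def row_to_topic(row_index, has_outlier_topic=True):
    """Map a matrix row index to a topic id.

    Assumes rows are ordered by topic id, starting at -1 when an
    outlier topic exists (assumption, apart from row 0 being topic -1).
    """
    offset = 1 if has_outlier_topic else 0
    return row_index - offset


def topic_to_row(topic_id, has_outlier_topic=True):
    """Inverse of row_to_topic under the same ordering assumption."""
    offset = 1 if has_outlier_topic else 0
    return topic_id + offset


# Row 0 holds the outlier topic; topic 0 therefore lives in row 1.
outlier_topic = row_to_topic(0)
first_real_row = topic_to_row(0)
```

If a model was fitted without any outliers, the offset would presumably disappear, which is why the flag is left as a parameter rather than hard-coded.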

@kjaksic When you run `topic_model.topic_embeddings` what you get back is not the average document embedding for each topic. Although a topic consists of a number of documents, getting the average...

> Since inclusion of the time component (topics over time) in the model allows for the topic representation (top n words) to differ across the time, this should also affect...

> thank you for the answer. Would it be possible to extract c-TF-IDF matrix at different time points and multiply it with word embeddings of keywords (as the topic embedding...

It might be that the documents in that specific topic are (near-)empty which would also result in an empty topic. I would suggest exploring the documents on that topic to...

Thanks for sharing this! It is a known issue within the `sentence-transformers` framework which will soon be fixed and released. You can find a bit more about that [here](https://github.com/UKPLab/sentence-transformers/issues/1599). You...

It is not necessarily related to the parameter of the default number of topics given. Most likely it is related to the documents that make up the topic. If you...

> I see, thank you for your suggestion, I tried to check content of docs but they seem ok, but still I have empty tuples in topics

To be on...