
After inference, BERTopic predicts the noise topic (-1) for most new documents

Open NShweta19 opened this issue 2 years ago • 1 comments

Hi, my transform function is predicting the noise topic (topic -1) for most of the new data.

Your support for this is appreciated.

Thanks, Shweta

NShweta19 • Jul 15 '22 11:07

Without a bit more context, it is difficult to say what exactly is happening here. How many documents are you using for fitting, and how many for predicting? How many topics were created when fitting the model? Also, how did you train BERTopic?

Having said that, -1 topics tend to be generated a bit more when using HDBSCAN. A way of minimizing that is following along with the FAQ here. Many of these steps are also relevant for using the transform function.
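As a rough illustration of one such step, here is a minimal sketch that reassigns -1 predictions to their most probable topic. It assumes the model was created with `calculate_probabilities=True`, so that `topic_model.transform(new_docs)` returns a per-document probability matrix whose column `i` corresponds to topic `i`. The helper name `reassign_outliers` and the `threshold` value are hypothetical, and the toy lists below only stand in for real model output.

```python
from typing import List, Sequence


def reassign_outliers(
    topics: Sequence[int],
    probs: Sequence[Sequence[float]],
    threshold: float = 0.05,
) -> List[int]:
    """Replace -1 labels with the most probable topic if it clears the threshold."""
    new_topics = []
    for topic, doc_probs in zip(topics, probs):
        if topic == -1 and len(doc_probs) > 0:
            # Index of the highest probability; assumed to equal the topic id.
            best = max(range(len(doc_probs)), key=lambda i: doc_probs[i])
            if doc_probs[best] >= threshold:
                topic = best
        new_topics.append(int(topic))
    return new_topics


# Toy stand-in for `topics, probs = topic_model.transform(new_docs)`.
example_topics = [-1, 0, -1, 2]
example_probs = [
    [0.10, 0.70, 0.05],
    [0.90, 0.05, 0.05],
    [0.01, 0.02, 0.01],
    [0.10, 0.10, 0.80],
]
print(reassign_outliers(example_topics, example_probs))  # [1, 0, -1, 2]
```

A higher `threshold` keeps more documents in -1, while a lower one assigns more of them to a topic at the cost of less certain labels, so the value is worth tuning on your own data.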

Hopefully, this helps a bit!

MaartenGr • Jul 15 '22 17:07

Due to inactivity, I'll be closing this for now. Let me know if you have any other questions related to this and I'll make sure to re-open the issue!

MaartenGr • Sep 27 '22 08:09