Olivier Bougeant
Thanks for the update @cjnolet! I think it's quite clear to me that we're not going to get exactly the same results with cuML's UMAP and the CPU-based umap.UMAP. However,...
Thanks @MaartenGr for your detailed answer. I will take a look at other clustering algorithms to see if these issues persist.
@MaartenGr, as of BERTopic `0.16.2`, it seems that the probabilities returned in the zero-shot case are actually cosine similarities, not real probabilities. This can be seen in @amitca71's example above,...
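To illustrate the distinction raised in the comment above, here is a minimal sketch in plain Python showing why a set of cosine similarities is not a probability distribution. It does not call BERTopic; the toy embeddings, the helper names, and the softmax rescaling are all assumptions made for illustration, not something the library is confirmed to do.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; lies in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Rescale arbitrary scores into non-negative values that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy embeddings: one document and three topics.
doc = [0.9, 0.1, 0.3]
topics = [[1.0, 0.0, 0.2], [0.1, 1.0, 0.0], [0.5, 0.5, 0.5]]

similarities = [cosine_similarity(doc, t) for t in topics]
probabilities = softmax(similarities)

print("similarities:", similarities, "sum =", sum(similarities))
print("softmax:     ", probabilities, "sum =", sum(probabilities))
```

The similarities are each bounded by 1 in absolute value but their sum is unconstrained, whereas the softmax output sums to 1 and keeps the same ranking. Whether softmax is an appropriate way to turn similarities into probabilities is a separate modelling question.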
Thanks for your reply @MaartenGr. This behaviour is indeed not specific to 0.16.2. I guess I was hoping I was doing something wrong and could get probabilities out of a...
Thanks for the clarification @MaartenGr. This is very helpful.
I have the same error.
Sure! Here goes: `pip install bertopic==0.16.1 datasets`
```
import logging
import pandas as pd
import spacy
from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic
from bertopic.representation import PartOfSpeech
from sentence_transformers...
```