BERTopic
Cannot import BERTopic --> ImportError traces back to Hugging Face's huggingface_hub package
Firstly, thanks for making this library and open-sourcing it. It's really awesome.
I have used BERTopic before, but today I couldn't run it after installing it. It threw the error below. I tried to resolve it but couldn't get any further. Hugging Face has made some changes to huggingface_hub (link). Any pointers on how to resolve this?
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
~/GitHub/learning/l_ml/l_bertopic.ipynb Cell 1' in <cell line: 1>()
----> 1 from bertopic import BERTopic
      2 from sklearn.datasets import fetch_20newsgroups
      4 docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
File ~/.cache/pypoetry/virtualenvs/clustering-ho3GNO8w-py3.8/lib/python3.8/site-packages/bertopic/__init__.py:1, in <module>
----> 1 from bertopic._bertopic import BERTopic
3 __version__ = "0.10.0"
5 __all__ = [
6 "BERTopic",
7 ]
File ~/.cache/pypoetry/virtualenvs/clustering-ho3GNO8w-py3.8/lib/python3.8/site-packages/bertopic/_bertopic.py:31, in <module>
29 from bertopic._utils import MyLogger, check_documents_type, check_embeddings_shape, check_is_fitted
30 from bertopic._mmr import mmr
---> 31 from bertopic.backend._utils import select_backend
32 from bertopic import plotting
34 # Visualization
File ~/.cache/pypoetry/virtualenvs/clustering-ho3GNO8w-py3.8/lib/python3.8/site-packages/bertopic/backend/__init__.py:2, in <module>
1 from ._base import BaseEmbedder
----> 2 from ._word_doc import WordDocEmbedder
3 from ._utils import languages
...
(...)
418 use_auth_token: Union[bool, str, None] = None
419 ) -> str:
ImportError: cannot import name 'REPO_ID_SEPARATOR' from 'huggingface_hub.snapshot_download' (~/.cache/pypoetry/virtualenvs/clustering-ho3GNO8w-py3.8/lib/python3.8/site-packages/huggingface_hub/snapshot_download.py)
When I looked into this, I saw that Hugging Face has made snapshot_download.py private, as shown below.
# TODO: remove in 0.11
import warnings
warnings.warn(
    "snapshot_download.py has been made private and will no longer be available from"
    " version 0.11. Please use `from huggingface_hub import snapshot_download` to"
    " import the only public function in this module. Other members of the file may be"
    " changed without a deprecation notice.",
    FutureWarning,
)
from ._snapshot_download import * # noqa
from .constants import REPO_ID_SEPARATOR # noqa
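For anyone hitting the same error, a quick way to see which module in the installed huggingface_hub actually exposes REPO_ID_SEPARATOR is a small probe like the one below. This is only a minimal sketch: `can_import_name` is a hypothetical helper written for this illustration, and the two module paths are simply the ones that appear in the traceback and in the snippet above.

```python
import importlib


def can_import_name(module_name: str, attr: str) -> bool:
    """Return True if the module can be imported and exposes `attr`.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a missing submodule.
        return False
    return hasattr(module, attr)


# Module paths taken from the traceback and the snippet above.
for module_name in (
    "huggingface_hub.snapshot_download",
    "huggingface_hub.constants",
):
    print(module_name, can_import_name(module_name, "REPO_ID_SEPARATOR"))
```

If the first line prints False, the installed huggingface_hub does not offer the name at the location the failing import expects, which matches the ImportError above.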
Thanks for sharing this! It is a known issue within the sentence-transformers framework, which will soon be fixed and released. You can find a bit more about that here. You can also install sentence-transformers from the master branch, where it is already fixed.
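To confirm which versions are actually installed in the environment (before and after switching sentence-transformers to the master branch), something like the following may help. It is a rough sketch using only the standard library; `installed_version` is a hypothetical helper, and the distribution names are assumed to be the ones used on PyPI.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(dist_name: str):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions based on the PyPI package names.
for name in ("bertopic", "sentence-transformers", "huggingface-hub"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Including this output in a bug report makes it easier to tell whether the environment contains the combination of versions that triggers the import error.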