Error when running pip install top2vec[sentence_encoders]
Hi! I tried running pip install top2vec[sentence_encoders] in a Jupyter notebook under Anaconda on Windows 11 and got the following error:
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: top2vec[sentence_encoders] in c:\users\hiron\appdata\roaming\python\python311\site-packages (1.0.34)
Requirement already satisfied: numpy>=1.20.0 in c:\programdata\anaconda3\lib\site-packages (from top2vec[sentence_encoders]) (1.26.4)
Requirement already satisfied: pandas in c:\programdata\anaconda3\lib\site-packages (from top2vec[sentence_encoders]) (2.1.4)
Requirement already satisfied: scikit-learn>=1.2.0 in c:\programdata\anaconda3\lib\site-packages (from top2vec[sentence_encoders]) (1.2.2)
Requirement already satisfied: gensim>=4.0.0 in c:\programdata\anaconda3\lib\site-packages (from top2vec[sentence_encoders]) (4.3.0)
Requirement already satisfied: umap-learn>=0.5.1 in c:\users\hiron\appdata\roaming\python\python311\site-packages (from top2vec[sentence_encoders]) (0.5.6)
Requirement already satisfied: hdbscan>=0.8.27 in c:\users\hiron\appdata\roaming\python\python311\site-packages (from top2vec[sentence_encoders]) (0.8.33)
Requirement already satisfied: wordcloud in c:\users\hiron\appdata\roaming\python\python311\site-packages (from top2vec[sentence_encoders]) (1.9.3)
Collecting tensorflow (from top2vec[sentence_encoders])
Using cached tensorflow-2.16.1-cp311-cp311-win_amd64.whl.metadata (3.5 kB)
Collecting tensorflow-hub (from top2vec[sentence_encoders])
Using cached tensorflow_hub-0.16.1-py2.py3-none-any.whl.metadata (1.3 kB)
INFO: pip is looking at multiple versions of top2vec[sentence-encoders] to determine which version is compatible with other requirements. This could take a while.
Collecting top2vec[sentence_encoders]
Using cached top2vec-1.0.33-py3-none-any.whl.metadata (18 kB)
Using cached top2vec-1.0.32-py3-none-any.whl.metadata (18 kB)
Using cached top2vec-1.0.31-py3-none-any.whl.metadata (18 kB)
Using cached top2vec-1.0.30-py3-none-any.whl.metadata (18 kB)
Using cached top2vec-1.0.29-py3-none-any.whl.metadata (18 kB)
Using cached top2vec-1.0.28-py3-none-any.whl.metadata (18 kB)
Using cached top2vec-1.0.27-py3-none-any.whl.metadata (17 kB)
INFO: pip is still looking at multiple versions of top2vec[sentence-encoders] to determine which version is compatible with other requirements. This could take a while.
Using cached top2vec-1.0.26-py3-none-any.whl.metadata (17 kB)
Collecting gensim<4.0.0 (from top2vec[sentence_encoders])
Using cached gensim-3.8.3.tar.gz (23.4 MB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting top2vec[sentence_encoders]
Using cached top2vec-1.0.25-py3-none-any.whl.metadata (17 kB)
Using cached top2vec-1.0.24-py3-none-any.whl.metadata (17 kB)
Using cached top2vec-1.0.23-py3-none-any.whl.metadata (17 kB)
Using cached top2vec-1.0.22-py3-none-any.whl.metadata (17 kB)
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
Using cached top2vec-1.0.21-py3-none-any.whl.metadata (17 kB)
Requirement already satisfied: pynndescent>=0.4 in c:\users\hiron\appdata\roaming\python\python311\site-packages (from top2vec[sentence_encoders]) (0.5.12)
Collecting joblib<1.0.0 (from top2vec[sentence_encoders])
Using cached joblib-0.17.0-py3-none-any.whl.metadata (4.5 kB)
Collecting numpy (from top2vec[sentence_encoders])
Using cached numpy-1.19.2.zip (7.3 MB)
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
Preparing metadata (pyproject.toml) did not run successfully.
exit code: 1
[93 lines of output]
Running from numpy source directory.
setup.py:470: UserWarning: Unrecognized setuptools command, proceeding with generating Cython sources and expanding templates
run_build = parse_setuppy_commands()
performance hint: _common.pyx:275:19: Exception check after calling 'random_func' will always require the GIL to be acquired. Declare 'random_func' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:299:19: Exception check after calling 'random_func' will always require the GIL to be acquired. Declare 'random_func' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:322:50: Exception check after calling 'random_func' will always require the GIL to be acquired. Declare 'random_func' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:426:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:465:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:509:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:592:36: Exception check after calling 'f0' will always require the GIL to be acquired. Declare 'f0' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:596:36: Exception check after calling 'f1' will always require the GIL to be acquired. Declare 'f1' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:600:36: Exception check after calling 'f2' will always require the GIL to be acquired. Declare 'f2' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:604:36: Exception check after calling 'f3' will always require the GIL to be acquired. Declare 'f3' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:638:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:675:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:712:63: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:754:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:785:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:903:40: Exception check after calling 'f0' will always require the GIL to be acquired. Declare 'f0' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:907:40: Exception check after calling 'fd' will always require the GIL to be acquired. Declare 'fd' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:911:41: Exception check after calling 'fdd' will always require the GIL to be acquired. Declare 'fdd' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:916:40: Exception check after calling 'fi' will always require the GIL to be acquired. Declare 'fi' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:920:41: Exception check after calling 'fdi' will always require the GIL to be acquired. Declare 'fdi' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:924:38: Exception check after calling 'fiii' will always require the GIL to be acquired. Declare 'fiii' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:960:31: Exception check after calling 'f' will always require the GIL to be acquired. Declare 'f' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _common.pyx:1002:32: Exception check after calling 'f1' will always require the GIL to be acquired. Declare 'f1' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
performance hint: _generator.pyx:707:41: Exception check after calling '_shuffle_int' will always require the GIL to be acquired.
Possible solutions:
1. Declare '_shuffle_int' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
2. Use an 'int' return type on '_shuffle_int' to allow an error code to be returned.
performance hint: _generator.pyx:736:45: Exception check after calling '_shuffle_int' will always require the GIL to be acquired.
Possible solutions:
1. Declare '_shuffle_int' as 'noexcept' if you control the definition and you're sure you don't want the function to raise exceptions.
2. Use an 'int' return type on '_shuffle_int' to allow an error code to be returned.
Error compiling Cython file:
------------------------------------------------------------
...
for i in range(1, RK_STATE_LEN):
self.rng_state.key[i] = val[i]
self.rng_state.pos = i
self._bitgen.state = &self.rng_state
self._bitgen.next_uint64 = &mt19937_uint64
^
------------------------------------------------------------
_mt19937.pyx:138:35: Cannot assign type 'uint64_t (*)(void *) except? -1 nogil' to 'uint64_t (*)(void *) noexcept nogil'. Exception values are incompatible. Suggest adding 'noexcept' to the type of the value being assigned.
Processing numpy/random\_bounded_integers.pxd.in
Processing numpy/random\bit_generator.pyx
Processing numpy/random\mtrand.pyx
Processing numpy/random\_bounded_integers.pyx.in
Processing numpy/random\_common.pyx
Processing numpy/random\_generator.pyx
Processing numpy/random\_mt19937.pyx
Traceback (most recent call last):
File "C:\Users\hiron\AppData\Local\Temp\pip-install-cc4dnroq\numpy_de640f75346a4cdfb93d66892a0eef7d\tools\cythonize.py", line 235, in <module>
main()
File "C:\Users\hiron\AppData\Local\Temp\pip-install-cc4dnroq\numpy_de640f75346a4cdfb93d66892a0eef7d\tools\cythonize.py", line 231, in main
find_process_files(root_dir)
File "C:\Users\hiron\AppData\Local\Temp\pip-install-cc4dnroq\numpy_de640f75346a4cdfb93d66892a0eef7d\tools\cythonize.py", line 222, in find_process_files
process(root_dir, fromfile, tofile, function, hash_db)
File "C:\Users\hiron\AppData\Local\Temp\pip-install-cc4dnroq\numpy_de640f75346a4cdfb93d66892a0eef7d\tools\cythonize.py", line 188, in process
processor_function(fromfile, tofile)
File "C:\Users\hiron\AppData\Local\Temp\pip-install-cc4dnroq\numpy_de640f75346a4cdfb93d66892a0eef7d\tools\cythonize.py", line 77, in process_pyx
subprocess.check_call(
File "C:\ProgramData\anaconda3\Lib\subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['C:\\ProgramData\\anaconda3\\python.exe', '-m', 'cython', '-3', '--fast-fail', '-o', '_mt19937.c', '_mt19937.pyx']' returned non-zero exit status 1.
Cythonizing sources
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
main()
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 149, in prepare_metadata_for_build_wheel
return hook(metadata_directory, config_settings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hiron\AppData\Local\Temp\pip-build-env-64j7yvkv\overlay\Lib\site-packages\setuptools\build_meta.py", line 157, in prepare_metadata_for_build_wheel
self.run_setup()
File "C:\Users\hiron\AppData\Local\Temp\pip-build-env-64j7yvkv\overlay\Lib\site-packages\setuptools\build_meta.py", line 249, in run_setup
self).run_setup(setup_script=setup_script)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hiron\AppData\Local\Temp\pip-build-env-64j7yvkv\overlay\Lib\site-packages\setuptools\build_meta.py", line 142, in run_setup
exec(compile(code, __file__, 'exec'), locals())
File "setup.py", line 499, in <module>
setup_package()
File "setup.py", line 479, in setup_package
generate_cython()
File "setup.py", line 274, in generate_cython
raise RuntimeError("Running cythonize failed!")
RuntimeError: Running cythonize failed!
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
Encountered error while generating package metadata.
See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Any pointers will be much appreciated. Thanks!
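For anyone reading a similar log: it shows pip backtracking from top2vec 1.0.34 down to 1.0.21 and then failing while building numpy 1.19.2 from source, with packages split between c:\programdata\anaconda3 and the user site-packages. A minimal sketch for checking what the notebook's interpreter actually sees is below. `installed_versions` is a hypothetical helper name, and the package list is only an assumption about what is relevant here.

```python
import sys
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of packages worth checking; adjust as needed.
packages = ["top2vec", "numpy", "gensim", "tensorflow", "tensorflow-hub", "tensorflow-text"]

# The interpreter path shows which environment the notebook kernel is using.
print("interpreter:", sys.executable)
for name, version in installed_versions(packages).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in a notebook cell and again in the terminal where pip was run can reveal whether the two are using different interpreters.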
Try running pip install top2vec\[sentence_encoders\] from the command line.
That gives a different error:
ERROR: Exception:
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_vendor\packaging\requirements.py", line 102, in __init__
req = REQUIREMENT.parseString(requirement_string)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_vendor\pyparsing\util.py", line 256, in _inner
return fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_vendor\pyparsing\core.py", line 1190, in parse_string
raise exc.with_traceback(None)
pip._vendor.pyparsing.exceptions.ParseException: Expected string_end, found '[' (at char 11), (line:1, col:12)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\cli\base_command.py", line 180, in exc_logging_wrapper
status = run_func(*args)
^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\cli\req_command.py", line 245, in wrapper
return func(self, options, args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\commands\install.py", line 342, in run
reqs = self.get_requirements(args, options, finder, session)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\cli\req_command.py", line 411, in get_requirements
req_to_add = install_req_from_line(
^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\req\constructors.py", line 421, in install_req_from_line
parts = parse_req_from_line(name, line_source)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\req\constructors.py", line 358, in parse_req_from_line
extras = convert_extras(extras_as_string)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\req\constructors.py", line 58, in convert_extras
return get_requirement("placeholder" + extras.lower()).extras
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_internal\utils\packaging.py", line 45, in get_requirement
return Requirement(req_string)
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\Lib\site-packages\pip\_vendor\packaging\requirements.py", line 104, in __init__
raise InvalidRequirement(
pip._vendor.packaging.requirements.InvalidRequirement: Parse error at "'[sentenc'": Expected string_end
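The traceback above suggests pip received the backslashes literally (the parse error points at the extras string), which is what would happen in a shell such as Windows cmd, where backslash is not an escape character. One way to sidestep shell quoting altogether, sketched here under that assumption, is to invoke pip from Python with the requirement passed as a single argument. `pip_install_command` and `pip_install` are hypothetical helper names.

```python
import subprocess
import sys


def pip_install_command(requirement):
    """Build a pip command bound to the currently running interpreter."""
    # Passing the requirement as one list element means no shell parsing
    # is involved, so the square brackets need no escaping or quoting.
    return [sys.executable, "-m", "pip", "install", requirement]


def pip_install(requirement):
    """Run pip for the given requirement; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(requirement))


# Example (not executed here):
# pip_install("top2vec[sentence_encoders]")
```

This only addresses the parse error. The resolver can still backtrack and fail as in the first log, since that run already used the unescaped form.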
I'm also hitting this issue on Python 3.8 on macOS, in a Jupyter notebook inside a venv. The suggested fixes do not seem to help.
UPDATE: I have tried this in a conda environment and it doesn't work there either.
I can confirm that this also occurs in Windows virtualenvs with Python 3.11 and 3.12, which I tested independently, when using pip install top2vec[sentence_encoders]. However, the standard installation using pip install top2vec works fine.
If anyone wants to use the embedding models, you can install them separately with pip install tensorflow tensorflow_hub torch sentence_transformers (I've omitted tensorflow-text from this, but you can try including it).
I'm still running into the same issue after days of trying different fixes, on Python 3.9. I tried pip install top2vec, then installed the embedding models with pip install top2vec[sentence_encoders], which completed successfully, and also added the BERT transformers option with pip install top2vec[sentence_encoders]. I also tried the alternative suggested by the error message, pip install tensorflow tensorflow_hub tensorflow_text (it reported everything as already satisfied, FYI).
I restarted the machine and the conda env several times, but I still get the same error:
`ImportError: universal-sentence-encoder is not available.
Try: pip install top2vec[sentence_encoders]
Alternatively try: pip install tensorflow tensorflow_hub tensorflow_text`
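Since pip reports the packages as already satisfied but the ImportError persists, it may help to check whether the modules actually import in the same kernel, and with what error. A minimal sketch follows, assuming (not confirmed in this thread) that top2vec shows this message when one of these imports fails. `import_status` is a hypothetical helper name.

```python
import importlib


def import_status(module_names):
    """Try importing each module and record 'ok' or the error that was raised."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = "ok"
        except Exception as exc:  # report any import-time failure, not only ImportError
            status[name] = f"{type(exc).__name__}: {exc}"
    return status
```

Calling `import_status(["tensorflow", "tensorflow_hub", "tensorflow_text"])` in the notebook and printing the result should show which of the three modules fails to import and why, which is more specific than the generic message above.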