ann-benchmarks
Python 3.7 and fresh checkout from master: dependency installation issue
Hello! I've tried installing dependencies under Python 3.7.10 and got the following output. What Python version is supported / recommended?
Running setup.py install for numpy ... error
ERROR: Command errored out with exit status 1:
command: /Users/dmitrykan/project/ann-benchmarks/venv/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/setup.py'"'"'; __file__='"'"'/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-record-wrmxmqjz/install-record.txt --single-version-externally-managed --compile --install-headers /Users/dmitrykan/project/ann-benchmarks/venv/include/site/python3.7/numpy
cwd: /private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/
Complete output (195 lines):
Running from numpy source directory.
Note: if you need reliable uninstall behavior, then install
with pip instead of using `setup.py install`:
- `pip install .` (from a git repo or downloaded source
release)
- `pip install numpy` (last NumPy release on PyPi)
blas_opt_info:
blas_mkl_info:
libraries mkl_rt not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
blis_info:
libraries blis not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
openblas_info:
libraries openblas not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
atlas_3_10_blas_threads_info:
Setting PTATLAS=ATLAS
libraries tatlas not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
atlas_3_10_blas_info:
libraries satlas not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
atlas_blas_threads_info:
Setting PTATLAS=ATLAS
libraries ptf77blas,ptcblas,atlas not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
atlas_blas_info:
libraries f77blas,cblas,atlas not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
FOUND:
extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers']
extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
define_macros = [('NO_ATLAS_INFO', 3), ('HAVE_CBLAS', None)]
/bin/sh: svnversion: command not found
non-existing path in 'numpy/distutils': 'site.cfg'
/bin/sh: svnversion: command not found
F2PY Version 2
lapack_opt_info:
lapack_mkl_info:
libraries mkl_rt not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
openblas_lapack_info:
libraries openblas not found in ['/Users/dmitrykan/project/ann-benchmarks/venv/lib', '/usr/local/lib', '/usr/lib']
NOT AVAILABLE
atlas_3_10_threads_info:
Setting PTATLAS=ATLAS
libraries tatlas,tatlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries lapack_atlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries tatlas,tatlas not found in /usr/local/lib
libraries lapack_atlas not found in /usr/local/lib
libraries tatlas,tatlas not found in /usr/lib
libraries lapack_atlas not found in /usr/lib
<class 'numpy.distutils.system_info.atlas_3_10_threads_info'>
NOT AVAILABLE
atlas_3_10_info:
libraries satlas,satlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries lapack_atlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries satlas,satlas not found in /usr/local/lib
libraries lapack_atlas not found in /usr/local/lib
libraries satlas,satlas not found in /usr/lib
libraries lapack_atlas not found in /usr/lib
<class 'numpy.distutils.system_info.atlas_3_10_info'>
NOT AVAILABLE
atlas_threads_info:
Setting PTATLAS=ATLAS
libraries ptf77blas,ptcblas,atlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries lapack_atlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
libraries lapack_atlas not found in /usr/local/lib
libraries ptf77blas,ptcblas,atlas not found in /usr/lib
libraries lapack_atlas not found in /usr/lib
<class 'numpy.distutils.system_info.atlas_threads_info'>
NOT AVAILABLE
atlas_info:
libraries f77blas,cblas,atlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries lapack_atlas not found in /Users/dmitrykan/project/ann-benchmarks/venv/lib
libraries f77blas,cblas,atlas not found in /usr/local/lib
libraries lapack_atlas not found in /usr/local/lib
libraries f77blas,cblas,atlas not found in /usr/lib
libraries lapack_atlas not found in /usr/lib
<class 'numpy.distutils.system_info.atlas_info'>
NOT AVAILABLE
FOUND:
extra_compile_args = ['-msse3']
extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
define_macros = [('NO_ATLAS_INFO', 3), ('HAVE_CBLAS', None)]
/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/dist.py:274: UserWarning: Unknown distribution option: 'define_macros'
warnings.warn(msg)
running install
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
build_src
building py_modules sources
creating build
creating build/src.macosx-11-x86_64-3.7
creating build/src.macosx-11-x86_64-3.7/numpy
creating build/src.macosx-11-x86_64-3.7/numpy/distutils
building library "npymath" sources
customize Gnu95FCompiler
Found executable /usr/local/bin/gfortran
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/setup.py", line 392, in <module>
setup_package()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/setup.py", line 384, in setup_package
setup(**metadata)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/core.py", line 169, in setup
return old_setup(**new_attr)
File "/Users/dmitrykan/project/ann-benchmarks/venv/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/install.py", line 62, in run
r = self.setuptools_run()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/install.py", line 36, in setuptools_run
return distutils_install.run(self)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/command/install.py", line 545, in run
self.run_command('build')
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/build.py", line 47, in run
old_build.run(self)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/build_src.py", line 148, in run
self.build_sources()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/build_src.py", line 159, in build_sources
self.build_library_sources(*libname_info)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/build_src.py", line 294, in build_library_sources
sources = self.generate_sources(sources, (lib_name, build_info))
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/build_src.py", line 377, in generate_sources
source = func(extension, build_dir)
File "numpy/core/setup.py", line 672, in get_mathlib_info
st = config_cmd.try_link('int main(void) { return 0;}')
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/distutils/command/config.py", line 243, in try_link
self._check_compiler()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/command/config.py", line 81, in _check_compiler
c_compiler=self.compiler)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/fcompiler/__init__.py", line 842, in new_fcompiler
c_compiler=c_compiler)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/fcompiler/__init__.py", line 816, in get_default_fcompiler
c_compiler=c_compiler)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/fcompiler/__init__.py", line 765, in _find_existing_fcompiler
c.customize(dist)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/fcompiler/__init__.py", line 521, in customize
linker_so_flags = self.flag_vars.linker_so
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/environment.py", line 39, in __getattr__
return self._get_var(name, conf_desc)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/environment.py", line 53, in _get_var
var = self._hook_handler(name, hook)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/fcompiler/__init__.py", line 700, in _environment_hook
return hook()
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/fcompiler/gnu.py", line 309, in get_flags_linker_so
flags = GnuFCompiler.get_flags_linker_so(self)
File "/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/numpy/distutils/fcompiler/gnu.py", line 138, in get_flags_linker_so
os.environ['MACOSX_DEPLOYMENT_TARGET'] = target
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/os.py", line 686, in __setitem__
value = self.encodevalue(value)
File "/usr/local/Cellar/[email protected]/3.7.10_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/os.py", line 756, in encode
raise TypeError("str expected, not %s" % type(value).__name__)
TypeError: str expected, not int
----------------------------------------
ERROR: Command errored out with exit status 1: /Users/dmitrykan/project/ann-benchmarks/venv/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/setup.py'"'"'; __file__='"'"'/private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-install-l_rkp2wp/numpy_7ba5c87712f044d5af4cf340e9f6bc24/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/2l/f04dpd917vx50cyl8fftcxzr0000gn/T/pip-record-wrmxmqjz/install-record.txt --single-version-externally-managed --compile --install-headers /Users/dmitrykan/project/ann-benchmarks/venv/include/site/python3.7/numpy Check the logs for full command output.
Python 3.6 is required for installing the current requirements.
On most setups I've tried with more recent versions of Python, just removing the pinned versions of the libraries worked fine.
The installation problems above seem to be related to NumPy, not ann-benchmarks.
Either way, it would be good to bump the Python version to 3.8 or ideally 3.9. I think 3.6 is ancient at this point.
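For context on the final `TypeError: str expected, not int` in the traceback above: `os.environ` only accepts string values, and the failing line assigns `target` to `MACOSX_DEPLOYMENT_TARGET`. The sketch below reproduces that failure mode with a hypothetical integer value of 11 (the log does not show the actual value; the `macosx-11` build directory only hints at it). The helper `set_deployment_target` is an illustrative name, not part of NumPy.

```python
import os

def set_deployment_target(target):
    """Store the target as a string; os.environ rejects non-str values."""
    os.environ["MACOSX_DEPLOYMENT_TARGET"] = str(target)

# Reproduce the failure mode from the traceback with a hypothetical integer value.
try:
    os.environ["MACOSX_DEPLOYMENT_TARGET"] = 11
except TypeError as exc:
    error_message = str(exc)

# Coercing to str first avoids the error.
set_deployment_target(11)
fixed_value = os.environ["MACOSX_DEPLOYMENT_TARGET"]
```

This only illustrates why the assignment fails; in practice the likely remedy is installing a NumPy version that ships a prebuilt wheel for the platform, so the source build never runs.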
Thanks so much for your responses! I'll try a lower Python version and report back here.
Hi there, I'm having a similar issue. However, I tried Python 3.6.13 and h5py requires 3.7+:
Collecting h5py==2.7.1 (from -r requirements.txt (line 3))
Using cached https://files.pythonhosted.org/packages/41/7a/6048de44c62fc5e618178ef9888850c3773a9e4be249e5e673ebce0402ff/h5py-2.7.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "/Users/nbrempel/.pyenv/versions/3.6.13/lib/python3.6/site-packages/setuptools/sandbox.py", line 154, in save_modules
yield saved
File "/Users/nbrempel/.pyenv/versions/3.6.13/lib/python3.6/site-packages/setuptools/sandbox.py", line 195, in setup_context
yield
File "/Users/nbrempel/.pyenv/versions/3.6.13/lib/python3.6/site-packages/setuptools/sandbox.py", line 250, in run_setup
_execfile(setup_script, ns)
File "/Users/nbrempel/.pyenv/versions/3.6.13/lib/python3.6/site-packages/setuptools/sandbox.py", line 45, in _execfile
exec(code, globals, locals)
File "/var/folders/cv/4kr757hn62gb0j3vdcm58jx40000gp/T/easy_install-z7ol4p_z/numpy-1.21.0/setup.py", line 34, in <module>
# RUN_REQUIRES can be removed when setup.py test is removed
RuntimeError: Python version >= 3.7 required.
(3.7 fails with other errors)
In my case, wheel was not available in my pyenv environment. Running pyenv exec pip install --upgrade pip setuptools wheel solved my problem.
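A quick way to check whether the build tooling mentioned above is present in the active environment, as a minimal sketch. The function name `missing_build_tools` is made up for illustration; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util

def missing_build_tools(names=("pip", "setuptools", "wheel")):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("pip, setuptools and wheel are all available")
```

If anything is reported missing, the `pip install --upgrade pip setuptools wheel` command from the comment above is the fix that worked there.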
Took a while to come back to this -- @nrempel, thanks for sharing a recipe that worked for you. I've just tried to bootstrap the project with 3.7 and got this in PyCharm:
The screenshot looks a bit strange because of matplotlib==2.1.0.
Above that message there is additional info on how to possibly proceed:
* The following required packages can not be built:
* freetype, png
* Try installing freetype with `brew install freetype`
* Try installing png with `brew install libpng`
Not sure if we need to pin the matplotlib version, feel free to use the latest and see if it works!
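Dropping the exact pins, as suggested earlier in the thread, can be done by hand or with a few lines of Python. This is a rough sketch only: `unpin` is a hypothetical helper, and it handles just the plain `name==version` form seen in this thread, not extras, markers, or other specifiers.

```python
import re

def unpin(requirement_line):
    """Drop an exact '==' version pin, leaving other lines untouched."""
    return re.sub(r"\s*==\s*[^\s;#]+", "", requirement_line)

# Pins quoted in this thread, plus an already unpinned entry.
pinned = ["matplotlib==2.1.0", "h5py==2.7.1", "numpy"]
unpinned = [unpin(line) for line in pinned]
```

Here `unpinned` ends up as `["matplotlib", "h5py", "numpy"]`, so pip is free to pick the latest release that supports the running Python version.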
I tried to upgrade to all latest versions with Python 3.7. The following library versions installed correctly:
ansicolors==1.1.8
docker==2.6.1
h5py==3.3.0
matplotlib==3.4.2
numpy==1.21.0
pyyaml==5.4
psutil==5.6.6
scipy==1.7.0
scikit-learn==0.24.2
jinja2==2.10
Going to verify by building docker image and running it next.
With the versions above, I got this distribution of successes and failures:
Install Status:
{'vespa': 'fail'}
{'elastiknn': 'fail'}
{'n2': 'success'}
{'flann': 'fail'}
{'pynndescent': 'fail'}
{'puffinn': 'success'}
{'annoy': 'success'}
{'hnswlib': 'success'}
{'scann': 'fail'}
{'nearpy': 'success'}
{'diskann_pq': 'fail'}
{'diskann': 'fail'}
{'opendistroknn': 'fail'}
{'faiss': 'success'}
{'sklearn': 'success'}
{'nmslib': 'success'}
{'elasticsearch': 'fail'}
{'rpforest': 'success'}
{'datasketch': 'success'}
{'kgraph': 'success'}
{'mih': 'success'}
{'milvus': 'success'}
{'dolphinn': 'success'}
{'sptag': 'success'}
{'mrpt': 'success'}
{'ngt': 'success'}
Need to investigate further.
Just wanted to log things as I go -- sorry if this is the wrong thread (I figured I'd keep everything in one place to avoid creating multiple tickets):
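To compare runs, the install status lines above can be tallied with a short script. This is a sketch that assumes each line is a one-entry Python dict literal, as printed above; `tally` is an illustrative name, not something from ann-benchmarks.

```python
import ast
from collections import Counter

def tally(status_lines):
    """Count 'success' and 'fail' entries in lines such as {'vespa': 'fail'}."""
    counts = Counter()
    for line in status_lines:
        entry = ast.literal_eval(line.strip())
        counts.update(entry.values())
    return counts

# A three-line sample taken from the list above.
sample = ["{'vespa': 'fail'}", "{'n2': 'success'}", "{'annoy': 'success'}"]
counts = tally(sample)
```

Applied to the full list above, the same function gives 17 successes and 9 failures out of 26 algorithms.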
python run.py --algorithm kgraph
leads to:
2021-07-10 13:38:40,382 - annb - INFO - Order: [Definition(algorithm='kgraph', constructor='KGraph', module='ann_benchmarks.algorithms.kgraph', docker_tag='ann-benchmarks-kgraph', arguments=['angular', {'reverse': -1, 'K': 200, 'L': 300, 'S': 20}, False], query_argument_groups=[[1], [2], [3], [4], [5], [10], [20], [30], [40], [50], [60], [70], [80], [90], [100]], disabled=False)]
2021-07-10 13:38:42,486 - annb.2c5441a317 - INFO - Created container 2c5441a317: CPU limit 1, mem limit 5444025088, timeout 7200, command ['--dataset', 'glove-100-angular', '--algorithm', 'kgraph', '--module', 'ann_benchmarks.algorithms.kgraph', '--constructor', 'KGraph', '--runs', '5', '--count', '10', '["angular", {"reverse": -1, "K": 200, "L": 300, "S": 20}, false]', '[1]', '[2]', '[3]', '[4]', '[5]', '[10]', '[20]', '[30]', '[40]', '[50]', '[60]', '[70]', '[80]', '[90]', '[100]']
2021-07-10 13:38:58,253 - annb.2c5441a317 - INFO - Generating control...
2021-07-10 13:39:03,596 - annb.2c5441a317 - INFO - Initializing...
2021-07-10 13:39:08,414 - annb.2c5441a317 - ERROR - Generating control...
Initializing...
2021-07-10 13:39:08,416 - annb.2c5441a317 - ERROR - Child process for container 2c5441a317 raised exception 137
Is it possible to add more colour to exception 137?
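A note on 137: it is a process exit code rather than a Python exception. By the usual shell and Docker convention, exit codes above 128 mean "128 + signal number", so 137 corresponds to signal 9 (SIGKILL). With a memory-limited container, a common cause is the kernel's out-of-memory killer, though the log above does not confirm that. A minimal sketch for decoding such codes (`describe_exit_code` is a made-up helper name):

```python
import signal

def describe_exit_code(code):
    """Explain an exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            return f"exit code {code} (unknown signal {signum})"
        return f"killed by {name} (signal {signum})"
    return f"exited with status {code}"

message = describe_exit_code(137)
```

On Linux and macOS, `message` is "killed by SIGKILL (signal 9)".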
@DmitryKey
The Docker containers are still going to use Python 3.6 if you didn't update the Dockerfile in https://github.com/erikbern/ann-benchmarks/blob/master/install/Dockerfile to use a more recent Ubuntu release. You could use the old requirements.txt inside the Docker containers (https://github.com/erikbern/ann-benchmarks/blob/master/install/Dockerfile#L8-L9) and another file locally to test whether this is the problem.
Thanks @maumueller! I've upgraded the common Dockerfile to Ubuntu 20.04 and adjusted the Python installation instructions:
-FROM ubuntu:18.04
+FROM ubuntu:20.04
RUN apt-get update
-RUN apt-get install -y python3-numpy python3-scipy python3-pip build-essential git
+RUN apt-get install python3.7
+RUN DEBIAN_FRONTEND="noninteractive" apt-get -y install python3-numpy python3-scipy python3-pip build-essential git
RUN pip3 install -U pip
Next, I had to modify sptag's Dockerfile:
RUN apt-get update && DEBIAN_FRONTEND="noninteractive" apt-get -y install wget build-essential libtbb-dev software-properties-common swig
Running the algorithm with python run.py --algorithm sptag begins normally, but after a while I'm getting:
2021-07-11 16:13:27,517 - annb.3ed2396351 - INFO - [4] Hash table is full! Set HashTableExponent to larger value (default is 2). NewHashTableExponent=3 NewPoolSize=131071
2021-07-11 16:13:29,093 - annb.3ed2396351 - INFO - [4] Hash table is full! Set HashTableExponent to larger value (default is 2). NewHashTableExponent=3 NewPoolSize=131071
2021-07-11 16:13:29,395 - annb.3ed2396351 - INFO - [4] Hash table is full! Set HashTableExponent to larger value (default is 2). NewHashTableExponent=3 NewPoolSize=131071
2021-07-11 16:13:43,022 - annb.3ed2396351 - INFO - [4] Hash table is full! Set HashTableExponent to larger value (default is 2). NewHashTableExponent=3 NewPoolSize=131071
2021-07-11 16:14:00,279 - annb.3ed2396351 - INFO - [4] Hash table is full! Set HashTableExponent to larger value (default is 2). NewHashTableExponent=3 NewPoolSize=131071
2021-07-11 18:14:07,082 - annb.3ed2396351 - ERROR - Container.wait for container 3ed2396351 failed with exception
Traceback (most recent call last):
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/urllib3/response.py", line 438, in _error_catcher
yield
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/urllib3/response.py", line 764, in read_chunked
self._update_chunk_length()
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/urllib3/response.py", line 694, in _update_chunk_length
line = self._fp.fp.readline()
File "/usr/local/Cellar/[email protected]/3.7.11/Frameworks/Python.framework/Versions/3.7/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/requests/models.py", line 753, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/urllib3/response.py", line 572, in stream
for line in self.read_chunked(amt, decode_content=decode_content):
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/urllib3/response.py", line 793, in read_chunked
self._original_response.close()
File "/usr/local/Cellar/[email protected]/3.7.11/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/urllib3/response.py", line 443, in _error_catcher
raise ReadTimeoutError(self._pool, None, "Read timed out.")
urllib3.exceptions.ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/dmitrykan/search/vs/ann-benchmarks/ann_benchmarks/runner.py", line 258, in run_docker
exit_code = container.wait(timeout=timeout)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/docker/models/containers.py", line 441, in wait
return self.client.api.wait(self.id, **kwargs)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/docker/utils/decorators.py", line 19, in wrapped
return f(self, resource_id, *args, **kwargs)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/docker/api/container.py", line 1257, in wait
res = self._post(url, timeout=timeout)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/docker/utils/decorators.py", line 46, in inner
return f(self, *args, **kwargs)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/docker/api/client.py", line 187, in _post
return self.post(url, **self._set_request_timeout(kwargs))
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/requests/sessions.py", line 697, in send
r.content
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/requests/models.py", line 831, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/Users/dmitrykan/search/vs/ann-benchmarks/venv/lib/python3.7/site-packages/requests/models.py", line 760, in generate
raise ConnectionError(e)
requests.exceptions.ConnectionError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.
If you have ideas on what to check, please let me know.
Pasting the running command in full, just in case:
(venv) dmitrykan@Dmitrys-MacBook-Pro ann-benchmarks % python run.py --algorithm sptag
2021-07-11 15:44:58,905 - annb - INFO - running only sptag
2021-07-11 15:44:59,851 - annb - INFO - Order: [Definition(algorithm='sptag', constructor='Sptag', module='ann_benchmarks.algorithms.sptag', docker_tag='ann-benchmarks-sptag', arguments=['angular', 'KDT'], query_argument_groups=[[100], [200], [400], [1000], [2000], [4000]], disabled=False), Definition(algorithm='sptag', constructor='Sptag', module='ann_benchmarks.algorithms.sptag', docker_tag='ann-benchmarks-sptag', arguments=['angular', 'BKT'], query_argument_groups=[[100], [200], [400], [1000], [2000], [4000]], disabled=False)]
2021-07-11 15:45:00,773 - annb.3ed2396351 - INFO - Created container 3ed2396351: CPU limit 1, mem limit 6130084608, timeout 7200, command ['--dataset', 'glove-100-angular', '--algorithm', 'sptag', '--module', 'ann_benchmarks.algorithms.sptag', '--constructor', 'Sptag', '--runs', '5', '--count', '10', '["angular", "KDT"]', '[100]', '[200]', '[400]', '[1000]', '[2000]', '[4000]']
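For reference, the timestamps in the logs above can be compared against the 7200-second timeout the container was created with. The sketch below uses only the two timestamps quoted in this comment (container creation and the `Container.wait` failure); `elapsed_seconds` is a hypothetical helper.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

def elapsed_seconds(start, end):
    """Seconds between two timestamps in the annb log format."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()

# Container created vs. Container.wait failure, both taken from the logs above.
elapsed = elapsed_seconds("2021-07-11 15:45:00,773", "2021-07-11 18:14:07,082")
exceeded_timeout = elapsed > 7200
```

The elapsed time is about 8946 seconds, roughly 2 hours 29 minutes, so the run was already past the configured timeout when the read timed out.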
Some of the algorithms will time out; that's normal.
I also wouldn't expect all the images to build properly. There are always some issues with a few of them.
If you get it working with Ubuntu 20.04 and Python 3.7 (or higher), I would love it if you could submit a pull request.
@erikbern thanks! Actually, I'm thinking it could be better to always use a specific release of each algorithm (where available) -- have you considered this? For instance, I was just trying to compile faiss and can see that it tries to pull a certain resource from Python 3.8:
> [8/8] RUN python3 -c 'import faiss; print(faiss.IndexFlatL2)':
#10 0.711 Traceback (most recent call last):
#10 0.711 File "<string>", line 1, in <module>
#10 0.711 File "<frozen zipimport>", line 259, in load_module
#10 0.711 File "/usr/local/lib/python3.8/dist-packages/faiss-1.7.1-py3.8.egg/faiss/__init__.py", line 18, in <module>
#10 0.711 File "<frozen zipimport>", line 259, in load_module
#10 0.711 File "/usr/local/lib/python3.8/dist-packages/faiss-1.7.1-py3.8.egg/faiss/loader.py", line 65, in <module>
#10 0.711 File "<frozen zipimport>", line 259, in load_module
#10 0.711 File "/usr/local/lib/python3.8/dist-packages/faiss-1.7.1-py3.8.egg/faiss/swigfaiss.py", line 13, in <module>
#10 0.711 ImportError: cannot import name '_swigfaiss' from 'faiss' (/usr/local/lib/python3.8/dist-packages/faiss-1.7.1-py3.8.egg/faiss/__init__.py)
------
executor failed running [/bin/sh -c python3 -c 'import faiss; print(faiss.IndexFlatL2)']: exit code: 1
This might originate from Ubuntu 20.04 itself, but it just occurred to me that having reproducible "compilability" would be a big boost to usability.
I fixed versions for the reproducibility setup for https://arxiv.org/abs/1807.05614 (e.g., https://github.com/maumueller/ann-benchmarks-reproducibility/blob/master/install/Dockerfile.ngt#L7) but I find it hard to imagine that developers will update these versions. Using the most recent version (with a chance of failing) seems more robust in terms of presenting up-to-date results.
I'm torn about it – the benefit of pinning versions is that things will be more stable, but the drawback is that we'll use outdated versions during benchmarks. I think to some extent the onus over time could be on the library developers to make sure the latest version builds and runs correctly (e.g., the FAISS developers seem quite eager to push updates to ann-benchmarks), but I'm not sure the "market power" is quite there for this to work more generally.
@maumueller I agree -- maybe not the developers of the specific algorithm, but the developers of ann-benchmarks could fix the versions -- and I see you did that in Milvus's case. I had issues compiling it with Python 3.7, but it worked with Python 3.6. Here is the full list of algorithms with their build status (some of them still failed, like diskann):
{'vespa': 'success'}
{'elastiknn': 'fail'}
{'n2': 'success'}
{'flann': 'fail'}
{'pynndescent': 'success'}
{'puffinn': 'success'}
{'annoy': 'success'}
{'hnswlib': 'success'}
{'scann': 'fail'}
{'nearpy': 'success'}
{'diskann_pq': 'fail'}
{'diskann': 'fail'}
{'opendistroknn': 'fail'}
{'faiss': 'success'}
{'sklearn': 'success'}
{'nmslib': 'success'}
{'elasticsearch': 'fail'}
{'rpforest': 'success'}
{'datasketch': 'success'}
{'kgraph': 'success'}
{'mih': 'success'}
{'milvus': 'success'}
{'dolphinn': 'success'}
{'sptag': 'success'}
{'mrpt': 'success'}
{'ngt': 'success'}
I'm talking to the Milvus developers about fixing the issue with compiling their latest v1.1.1 release. If this is successful, I will submit a PR.
@erikbern yes, you are right. It is quite a task to ask the whole ANN community to push updates in a timely manner. However, I was impressed to see that this repo is cited in the Google Research GitHub. Great job!
Hey guys, I wanted to ask for your advice: not sure if something is misconfigured on my side, but some of the algorithms run into a timeout issue.
Here is one example:
2021-07-23 15:58:24,017 - annb.17c6b01171 - INFO - Created container 17c6b01171: CPU limit 1, mem limit 9590553344, timeout 7200, command ['--dataset', 'glove-100-angular', '--algorithm', 'sptag', '--module', 'ann_benchmarks.algorithms.sptag', '--constructor', 'Sptag', '--runs', '5', '--count', '10', '["angular", "KDT"]', '[100]', '[200]', '[400]', '[1000]', '[2000]', '[4000]']
So SPTAG runs for a few hours, and the last several lines are:
2021-07-23 17:21:15,385 - annb.17c6b01171 - INFO - [4] Hash table is full! Set HashTableExponent to larger value (default is 2). NewHashTableExponent=3 NewPoolSize=131071
2021-07-23 21:47:11,345 - annb.17c6b01171 - ERROR - Container.wait for container 17c6b01171 failed with exception
Traceback (most recent call last):
File "/Users/dmitry/projects/github/vs/ann-benchmarks/venv/lib/python3.6/site-packages/urllib3/response.py", line 438, in _error_catcher
yield
File "/Users/dmitry/projects/github/vs/ann-benchmarks/venv/lib/python3.6/site-packages/urllib3/response.py", line 764, in read_chunked
self._update_chunk_length()
File "/Users/dmitry/projects/github/vs/ann-benchmarks/venv/lib/python3.6/site-packages/urllib3/response.py", line 694, in _update_chunk_length
line = self._fp.fp.readline()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/dmitry/projects/github/vs/ann-benchmarks/venv/lib/python3.6/site-packages/requests/models.py", line 758, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/Users/dmitry/projects/github/vs/ann-benchmarks/venv/lib/python3.6/site-packages/urllib3/response.py", line 572, in stream
for line in self.read_chunked(amt, decode_content=decode_content):
File "/Users/dmitry/projects/github/vs/ann-benchmarks/venv/lib/python3.6/site-packages/urllib3/response.py", line 793, in read_chunked
self._original_response.close()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/contextlib.py", line 99, in __exit__
self.gen.throw(type, value, traceback)
File "/Users/dmitry/projects/github/vs/ann-benchmarks/venv/lib/python3.6/site-packages/urllib3/response.py", line 443, in _error_catcher
raise ReadTimeoutError(self._pool, None, "Read timed out.")
urllib3.exceptions.ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.
Is there anything to tweak on the Docker / OS side? I will try to prevent my OS from going to sleep to see if this helps.
I tweaked the sleep schedule of my machine -- still the same issue. Will upgrade Docker next from 3.3.1 to a more recent version.
I also noticed a consistent insufficient-memory error for the PUFFINN algorithm. Is this a known issue?
2021-07-27 19:25:31,506 - annb.408f02c65f - INFO - Created container 408f02c65f: CPU limit 1, mem limit 9475463936, timeout 7200, command ['--dataset', 'glove-100-angular', '--algorithm', 'puffinn', '--module', 'ann_benchmarks.algorithms.puffinn', '--constructor', 'Puffinn', '--runs', '5', '--count', '10', '["angular", 268435456, "fht_crosspolytope"]', '[0.1]', '[0.2]', '[0.5]', '[0.7]', '[0.9]', '[0.95]', '[0.99]']
2021-07-27 19:26:13,949 - annb.408f02c65f - INFO - ['angular', 268435456, 'fht_crosspolytope']
2021-07-27 19:26:13,949 - annb.408f02c65f - INFO - Trying to instantiate ann_benchmarks.algorithms.puffinn.Puffinn(['angular', 268435456, 'fht_crosspolytope'])
2021-07-27 19:26:13,950 - annb.408f02c65f - INFO - got a train set of size (1183514 * 100)
2021-07-27 19:26:13,951 - annb.408f02c65f - INFO - got 10000 queries
2021-07-27 19:26:14,120 - annb.408f02c65f - INFO - Traceback (most recent call last):
2021-07-27 19:26:14,121 - annb.408f02c65f - INFO - File "run_algorithm.py", line 3, in <module>
2021-07-27 19:26:14,122 - annb.408f02c65f - INFO - run_from_cmdline()
2021-07-27 19:26:14,122 - annb.408f02c65f - INFO - File "/home/app/ann_benchmarks/runner.py", line 211, in run_from_cmdline
2021-07-27 19:26:14,123 - annb.408f02c65f - INFO - run(definition, args.dataset, args.count, args.runs, args.batch)
2021-07-27 19:26:14,124 - annb.408f02c65f - INFO - File "/home/app/ann_benchmarks/runner.py", line 122, in run
2021-07-27 19:26:14,124 - annb.408f02c65f - INFO - algo.fit(X_train)
2021-07-27 19:26:14,125 - annb.408f02c65f - INFO - File "/home/app/ann_benchmarks/algorithms/puffinn.py", line 35, in fit
2021-07-27 19:26:14,126 - annb.408f02c65f - INFO - self.index.rebuild()
2021-07-27 19:26:14,126 - annb.408f02c65f - INFO - ValueError: insufficient memory
2021-07-27 19:26:18,382 - annb.408f02c65f - ERROR - ['angular', 268435456, 'fht_crosspolytope']
Trying to instantiate ann_benchmarks.algorithms.puffinn.Puffinn(['angular', 268435456, 'fht_crosspolytope'])
got a train set of size (1183514 * 100)
got 10000 queries
Traceback (most recent call last):
File "run_algorithm.py", line 3, in <module>
run_from_cmdline()
File "/home/app/ann_benchmarks/runner.py", line 211, in run_from_cmdline
run(definition, args.dataset, args.count, args.runs, args.batch)
File "/home/app/ann_benchmarks/runner.py", line 122, in run
algo.fit(X_train)
File "/home/app/ann_benchmarks/algorithms/puffinn.py", line 35, in fit
self.index.rebuild()
ValueError: insufficient memory
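A back-of-the-envelope check on the numbers in this log, under two assumptions that the log does not confirm: that the second constructor argument (268435456) is a memory budget in bytes, and that the train set is stored as 4-byte floats. `raw_dataset_bytes` is an illustrative helper, not part of ann-benchmarks or PUFFINN.

```python
def raw_dataset_bytes(n_points, dimensions, bytes_per_value=4):
    """Size of a dense dataset in bytes, assuming fixed-width values."""
    return n_points * dimensions * bytes_per_value

# Second constructor argument from the log, assumed to be a byte budget (256 MiB).
budget = 268435456
# Train set shape from the log: 1183514 points with 100 dimensions.
needed = raw_dataset_bytes(1183514, 100)
fits = needed <= budget
```

Under those assumptions the raw vectors alone take 473405600 bytes, about 451 MiB, which is larger than a 256 MiB budget. That would be consistent with the "insufficient memory" error for this particular parameter setting.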
Hi @DmitryKey. Sorry for not following up on this further!
I think both the timeouts and memory problems are known issues. Did you notice any additional problems updating to Python 3.7? It seems necessary to me that we try to bump everything up to 3.7 or (better) 3.8.