annoy
annoy copied to clipboard
Initializing an AnnoyIndex crashes on AMD processors
I was trying to get a pip-installed Annoy v1.16.3 running on an m5a.large AWS Instance,
but it crashed with Illegal instruction (core dumped)
trying to execute
annoy.AnnoyIndex(512, "angular")
This doesn't occur on Intel-based m5.large Instances.
Doing some digging (details below), it seems to be because Annoy requires the AVX-512 instruction set, which isn't available on AMD processors.
m5a.large's AMD EPYC 7000 series processor supports these instruction sets (from /proc/cpuinfo
):
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save
The Annoy library seems to use these instruction sets (using this utility):
MODE64 (call)
AVX512 (vmovdqa64)
VLX (vmovdqa64)
AVX2 (vextracti128)
AVX (vmovups)
NOVLX (vmovups)
CMOV (cmovns)
CPU Generation: Unknown
Did you build it from source?
I did not!
ok, how did you install it then? if it was compiled elsewhere then it probably explains it
We build in a different instance than we run, which might end up in a different instance type due to our mix. Is there an option to relax the compilation so that we don't limit to the architecture it was compiled in? Something like using mtune instead of march. I'm not familiar with annoy's compilation process.
Seems likely that would work. Maybe you can add something to setup.py that lets you control the flag using an environment variable or similar?
Any news on this? im having the same issue.
sounds like maybe moving the -march=native
flag should do the trick. can you try that and let me know? i might remove it by default if it's causing issues for many people
I don't know about that flag, how should i pass it on to the installer?
We created a wrapper around gcc/g++ to replace march
to mtune
. Just updating the setup.py
in this repo wouldn't be enough for our case since we need to support existing versions of annoy.
i'll take a look at changing march to mtune in setup.py. just want to make sure there's no major performance implication.
As a temporary fix, i ended up installing it on the same instance i use it, instead of on a different instance (where my docker image was being built).
I would recommend leaving mtune/march as an option if possible (maybe an environment flag?). I see value of locking with march in cases where you want to optimize for performance vs mtune where you might be optimizing for other characteristics such as availability. march
would be a good choice for a lot of dev environments, while mtune
could be necessary for production to spread into multiple machine pools in a heterogenous cluster (our case).
you can actually set the ANNOY_COMPILER_ARGS
environment variable to whatever flags you want! it's not documented. see https://github.com/spotify/annoy/blob/master/setup.py
Oh! That is indeed a much better solution. Somehow I completely missed it when I skimmed the code.
Oh! That is indeed a much better solution. Somehow I completely missed it when I skimmed the code.
Can you share how you'd install annoy to avoid this issue?
This is from setup.py:
# Not all CPUs have march as a tuning parameter
1 cputune = ['-march=native',]
2 if platform.machine() == 'ppc64le':
3 extra_compile_args += ['-mcpu=native',]
4
5 if platform.machine() == 'x86_64': 6
6 extra_compile_args += cputune
7
8 if os.name != 'nt':
9 extra_compile_args += ['-O3', '-ffast-math', '-fno-associative-math']
Do I comment lines 1-6 or just delete the variable inside the brackets?
Also, I tried to run setup.py and there are command line arguments? I think they are required, so I can't get past the argparser. Do we have to provide those command line arguments?
...update: So since I last wrote, I cloned repo, commented lines 1-6 for now in setup.py
, cd into annoy source folder
and did pip install .
, gonna see if I can run the rest of my scripts and build the index without getting and illegal instruction error
Ok, thanks for experimenting with it! I'm going to see if there's a better solution for this. Enough people have been running into issues for me to think maybe I should consider a different approach. I might give it a shot to just use Eigen for dot products, or something similar.
Okay, so in case this helps anyone else, steps I took to by-pass the 'illegal instruction' error:
- activate your virtenv (
source ~/ann/bin/activate
) - git clone annoy repo
- `cd annoy/'
- `vim setup.py' Inside setup.py I comment out lines 1-6 from the file (NOTE: line #'s here do not correspond to the line #'s in the actual setup.py file I just did it below for ease of understanding)
# Not all CPUs have march as a tuning parameter
1 cputune = ['-march=native',]
2 if platform.machine() == 'ppc64le':
3 extra_compile_args += ['-mcpu=native',]
4
5 if platform.machine() == 'x86_64': 6
6 extra_compile_args += cputune
7
8 if os.name != 'nt':
9 extra_compile_args += ['-O3', '-ffast-math', '-fno-associative-math']
-
:wq
-
pip install .
- delete repo you don't need it anymore (ensure that you did get a confirmation message that it did install though)
I used .build_on_disk() and was able to build an annoy index successfully. Will keep you updated if something goes wrong.
Great! I'll investigate if there's some way to fix the underlying issue
This may be the problem I have been seeing on our setup as well. Everything is built in docker images, so originally I wasn't sure why I was getting failures. And since Annoy is integrated into our sagemaker pipeline, failures were some what opaque.
Plus, images built in our CI/CD pipeline were the only instances which would cause the error, but local docker builds would be fine. I'll experiment with these flags to see if the issue is resolved. Side note, the only reason why I guess it was a problem with annoy was comparing the two docker images with https://github.com/GoogleContainerTools/container-diff.
For anyone that has issues with this and builds annoy in docker, I used this line to override the ANNOY_COMPILER_ARGS
environment variable per @erikbern's suggestion. Thanks!
ENV ANNOY_COMPILER_ARGS -D_CRT_SECURE_NO_WARNINGS,-DANNOYLIB_MULTITHREADED_BUILD,-mtune=native
If anyone wants to give it a shot and try to fix this, I think the best approach is something like
- Make the distance function calculation a template argument of
AnnoyIndex
(but notAnnoyIndexInterface
) - At the point of instantiation of the index https://github.com/spotify/annoy/blob/master/src/annoymodule.cc#L153 – add (a) and #ifdef that checks if AVX is supported (b) checks
if (__builtin_cpu_supports("avx2"))
in runtime
Maybe there's other solutions too!
Yes, I know for a fact our University's cluster of CPUs does not support avx2, only regular avx, and I think this is where that error stems from also (I'm assuming that differentiation between avx2 and avx is also significant).
These are some other erros I have come accross along side trying to find a workaround for illegal instruction:
1)RuntimeError: Didn't find engine for operation quantized::linear_prepack NoQEngine (operator() at /pytorch/aten/src/ATen/native/quantized/cpu/qlinear_prepack.cpp:202)
2)RuntimeError: Your CPU does not support FBGEMM
"FBGEMM requires gcc 5+ and a CPU with support for avx2 instruction set or higher." -->> from googling this issue
My guess would be that you need to disable FBGEMM, at least when you run on architectures that don't support avx2. There appears to be a USE_FBGEMM environment variable which can be set to "0" to disable the use of FBGEMM.
are those last ones related to Annoy? seems like it says something about pytorch
I remember experiencing it as one of the many errors during the illegal instruction issue I was having, but you're right... this might not be particular to that and maybe more to do my virtenv. Not sure, thought I'd mention it.
For anyone that has issues with this and builds annoy in docker, I used this line to override the
ANNOY_COMPILER_ARGS
environment variable per @erikbern's suggestion. Thanks!
ENV ANNOY_COMPILER_ARGS -D_CRT_SECURE_NO_WARNINGS,-DANNOYLIB_MULTITHREADED_BUILD,-mtune=native
Yeahhhhhhh! You saved my mental health 😃 Solution Working on AWS ECS
Hello may i know the exact line that you used to install annoy in docker? Not sure how to pass this argument into my dockerfile
Hello may i know the exact line that you used to install annoy in docker? Not sure how to pass this argument into my dockerfile
It has been a few years, and there is probably a better way to do this now. But originally included these in my dockerfile
FROM python:3.7.5-slim-stretch
ENV ANNOY_COMPILER_ARGS -D_CRT_SECURE_NO_WARNINGS,-DANNOYLIB_MULTITHREADED_BUILD,-mtune=native
From there you can include anything else in your dockerfile including installing annoy.
@mmaybeno is there a better way to do this now?