annoy icon indicating copy to clipboard operation
annoy copied to clipboard

Initializing an AnnoyIndex crashes on AMD processors

Open convoliution opened this issue 4 years ago • 35 comments

I was trying to get a pip-installed Annoy v1.16.3 running on an m5a.large AWS Instance, but it crashed with Illegal instruction (core dumped) trying to execute

annoy.AnnoyIndex(512, "angular")

This doesn't occur on Intel-based m5.large Instances.

Doing some digging (details below), it seems to be because Annoy requires the AVX-512 instruction set, which isn't available on AMD processors.


m5a.large's AMD EPYC 7000 series processor supports these instruction sets (from /proc/cpuinfo):

fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save

The Annoy library seems to use these instruction sets (using this utility):

    MODE64 (call)
    AVX512 (vmovdqa64)
    VLX (vmovdqa64)
    AVX2 (vextracti128)
    AVX (vmovups)
    NOVLX (vmovups)
    CMOV (cmovns)
CPU Generation: Unknown

convoliution avatar Apr 02 '20 23:04 convoliution

Did you build it from source?

erikbern avatar Apr 03 '20 00:04 erikbern

I did not!

convoliution avatar Apr 03 '20 01:04 convoliution

ok, how did you install it then? if it was compiled elsewhere then it probably explains it

erikbern avatar Apr 03 '20 02:04 erikbern

We build in a different instance than we run, which might end up in a different instance type due to our mix. Is there an option to relax the compilation so that we don't limit to the architecture it was compiled in? Something like using mtune instead of march. I'm not familiar with annoy's compilation process.

conradoverta avatar Apr 03 '20 15:04 conradoverta

Seems likely that would work. Maybe you can add something to setup.py that lets you control the flag using an environment variable or similar?

erikbern avatar Apr 03 '20 15:04 erikbern

Any news on this? im having the same issue.

conradogarciaberrotaran avatar May 05 '20 18:05 conradogarciaberrotaran

sounds like maybe moving the -march=native flag should do the trick. can you try that and let me know? i might remove it by default if it's causing issues for many people

erikbern avatar May 06 '20 11:05 erikbern

I don't know about that flag, how should i pass it on to the installer?

conradogarciaberrotaran avatar May 06 '20 13:05 conradogarciaberrotaran

We created a wrapper around gcc/g++ to replace march to mtune. Just updating the setup.py in this repo wouldn't be enough for our case since we need to support existing versions of annoy.

conradoverta avatar May 06 '20 19:05 conradoverta

i'll take a look at changing march to mtune in setup.py. just want to make sure there's no major performance implication.

erikbern avatar May 06 '20 21:05 erikbern

As a temporary fix, i ended up installing it on the same instance i use it, instead of on a different instance (where my docker image was being built).

conradogarciaberrotaran avatar May 06 '20 21:05 conradogarciaberrotaran

I would recommend leaving mtune/march as an option if possible (maybe an environment flag?). I see value of locking with march in cases where you want to optimize for performance vs mtune where you might be optimizing for other characteristics such as availability. march would be a good choice for a lot of dev environments, while mtune could be necessary for production to spread into multiple machine pools in a heterogenous cluster (our case).

conradoverta avatar May 06 '20 22:05 conradoverta

you can actually set the ANNOY_COMPILER_ARGS environment variable to whatever flags you want! it's not documented. see https://github.com/spotify/annoy/blob/master/setup.py

erikbern avatar May 08 '20 03:05 erikbern

Oh! That is indeed a much better solution. Somehow I completely missed it when I skimmed the code.

conradoverta avatar May 08 '20 03:05 conradoverta

Oh! That is indeed a much better solution. Somehow I completely missed it when I skimmed the code.

Can you share how you'd install annoy to avoid this issue?

conradogarciaberrotaran avatar May 11 '20 14:05 conradogarciaberrotaran

This is from setup.py:

# Not all CPUs have march as a tuning parameter                                                                   
1 cputune = ['-march=native',]                                                                                      
2 if platform.machine() == 'ppc64le':                                                                               
3     extra_compile_args += ['-mcpu=native',]                                                                       
4                                                                                                                  
5 if platform.machine() == 'x86_64':                                                                                6 
6     extra_compile_args += cputune                                                                                 
7                                                                                                                  
8 if os.name != 'nt':                                                                                               
9     extra_compile_args += ['-O3', '-ffast-math', '-fno-associative-math']

Do I comment lines 1-6 or just delete the variable inside the brackets?

Also, I tried to run setup.py and there are command line arguments? I think they are required, so I can't get past the argparser. Do we have to provide those command line arguments?

...update: So since I last wrote, I cloned repo, commented lines 1-6 for now in setup.py, cd into annoy source folder and did pip install ., gonna see if I can run the rest of my scripts and build the index without getting and illegal instruction error

bikegirl avatar Jul 23 '20 20:07 bikegirl

Ok, thanks for experimenting with it! I'm going to see if there's a better solution for this. Enough people have been running into issues for me to think maybe I should consider a different approach. I might give it a shot to just use Eigen for dot products, or something similar.

erikbern avatar Jul 24 '20 02:07 erikbern

Okay, so in case this helps anyone else, steps I took to by-pass the 'illegal instruction' error:

  1. activate your virtenv (source ~/ann/bin/activate)
  2. git clone annoy repo
  3. `cd annoy/'
  4. `vim setup.py' Inside setup.py I comment out lines 1-6 from the file (NOTE: line #'s here do not correspond to the line #'s in the actual setup.py file I just did it below for ease of understanding)
# Not all CPUs have march as a tuning parameter                                                                   
1 cputune = ['-march=native',]                                                                                      
2 if platform.machine() == 'ppc64le':                                                                               
3     extra_compile_args += ['-mcpu=native',]                                                                       
4                                                                                                                  
5 if platform.machine() == 'x86_64':                                                                                6 
6     extra_compile_args += cputune                                                                                 
7                                                                                                                  
8 if os.name != 'nt':                                                                                               
9     extra_compile_args += ['-O3', '-ffast-math', '-fno-associative-math']
  1. :wq
  2. pip install .
  3. delete repo you don't need it anymore (ensure that you did get a confirmation message that it did install though)

I used .build_on_disk() and was able to build an annoy index successfully. Will keep you updated if something goes wrong.

bikegirl avatar Jul 25 '20 21:07 bikegirl

Great! I'll investigate if there's some way to fix the underlying issue

erikbern avatar Jul 26 '20 02:07 erikbern

This may be the problem I have been seeing on our setup as well. Everything is built in docker images, so originally I wasn't sure why I was getting failures. And since Annoy is integrated into our sagemaker pipeline, failures were some what opaque.

Plus, images built in our CI/CD pipeline were the only instances which would cause the error, but local docker builds would be fine. I'll experiment with these flags to see if the issue is resolved. Side note, the only reason why I guess it was a problem with annoy was comparing the two docker images with https://github.com/GoogleContainerTools/container-diff.

mmaybeno avatar Aug 01 '20 05:08 mmaybeno

For anyone that has issues with this and builds annoy in docker, I used this line to override the ANNOY_COMPILER_ARGS environment variable per @erikbern's suggestion. Thanks!

ENV ANNOY_COMPILER_ARGS -D_CRT_SECURE_NO_WARNINGS,-DANNOYLIB_MULTITHREADED_BUILD,-mtune=native

mmaybeno avatar Aug 05 '20 15:08 mmaybeno

If anyone wants to give it a shot and try to fix this, I think the best approach is something like

  1. Make the distance function calculation a template argument of AnnoyIndex (but not AnnoyIndexInterface)
  2. At the point of instantiation of the index https://github.com/spotify/annoy/blob/master/src/annoymodule.cc#L153 – add (a) and #ifdef that checks if AVX is supported (b) checks if (__builtin_cpu_supports("avx2")) in runtime

Maybe there's other solutions too!

erikbern avatar Aug 05 '20 15:08 erikbern

Yes, I know for a fact our University's cluster of CPUs does not support avx2, only regular avx, and I think this is where that error stems from also (I'm assuming that differentiation between avx2 and avx is also significant).

bikegirl avatar Aug 06 '20 18:08 bikegirl

These are some other erros I have come accross along side trying to find a workaround for illegal instruction:

1)RuntimeError: Didn't find engine for operation quantized::linear_prepack NoQEngine (operator() at /pytorch/aten/src/ATen/native/quantized/cpu/qlinear_prepack.cpp:202)
 
2)RuntimeError: Your CPU does not support FBGEMM

"FBGEMM requires gcc 5+ and a CPU with support for avx2 instruction set or higher." -->> from googling this issue

My guess would be that you need to disable FBGEMM, at least when you run on architectures that don't support avx2. There appears to be a USE_FBGEMM environment variable which can be set to "0" to disable the use of FBGEMM.

bikegirl avatar Aug 06 '20 18:08 bikegirl

are those last ones related to Annoy? seems like it says something about pytorch

erikbern avatar Aug 06 '20 20:08 erikbern

I remember experiencing it as one of the many errors during the illegal instruction issue I was having, but you're right... this might not be particular to that and maybe more to do my virtenv. Not sure, thought I'd mention it.

bikegirl avatar Aug 07 '20 02:08 bikegirl

For anyone that has issues with this and builds annoy in docker, I used this line to override the ANNOY_COMPILER_ARGS environment variable per @erikbern's suggestion. Thanks!

ENV ANNOY_COMPILER_ARGS -D_CRT_SECURE_NO_WARNINGS,-DANNOYLIB_MULTITHREADED_BUILD,-mtune=native

Yeahhhhhhh! You saved my mental health 😃 Solution Working on AWS ECS

italo1983 avatar Jan 20 '21 09:01 italo1983

Hello may i know the exact line that you used to install annoy in docker? Not sure how to pass this argument into my dockerfile

gary-x0pa avatar Mar 09 '22 06:03 gary-x0pa

Hello may i know the exact line that you used to install annoy in docker? Not sure how to pass this argument into my dockerfile

It has been a few years, and there is probably a better way to do this now. But originally included these in my dockerfile

FROM python:3.7.5-slim-stretch

ENV ANNOY_COMPILER_ARGS -D_CRT_SECURE_NO_WARNINGS,-DANNOYLIB_MULTITHREADED_BUILD,-mtune=native

From there you can include anything else in your dockerfile including installing annoy.

mmaybeno avatar Mar 11 '22 00:03 mmaybeno

@mmaybeno is there a better way to do this now?

ankit-db avatar Jun 01 '23 05:06 ankit-db