
Does not compile for ARM

Open mattchatporter opened this issue 3 years ago • 14 comments

Attempting to run make on an ARM system yields the following error:

ksw.c:29:10: fatal error: emmintrin.h: No such file or directory
29 | #include <emmintrin.h>

mattchatporter avatar Sep 30 '20 15:09 mattchatporter

bwa requires the SSE2 instruction set. emmintrin.h is the header file for SSE2 intrinsics. You are compiling the code for a platform without SSE2 support.

tomaskopsa avatar Oct 03 '20 21:10 tomaskopsa

Exactly - and considering that the following are true:

  • ARM workloads are typically 40% more cost-effective when compared to running the same workload on an Intel CPU (e.g. EC2 c6g)
  • Alignment is one of the most expensive steps in whole genome sequencing analysis
  • Compiling software for ARM can often be as simple as using the appropriate ARM compiler and setting the appropriate flags

It makes good sense, then, to make a high-value, (hopefully) low-effort update to this project to support ARM. Maybe there's more work involved here than a simple compiler and flag switch, but if another existing aligner adds ARM support while BWA has no announced plans to do so, concerns about cost performance and future hardware support will make it start to look less attractive as a real-world tool.

mattchatporter avatar Oct 05 '20 17:10 mattchatporter

Where does the 40% number come from? Even if that is true for a particular benchmark, it probably does not generalize. You are underestimating how high-performance tools are developed these days. At times it is more complicated than changing flags. In the case of bwa-mem, you need to emulate SSE2 with simde or sse2neon. This is not a one-to-one translation and will cost performance. It is even trickier with bwa-mem2, as most ARM CPUs don't support 256-bit SIMD, let alone 512-bit. Also, cache performance/bandwidth, branch prediction, etc. differ between CPUs. You need architecture-specific optimization. I doubt a simple port can match the performance on x86 CPUs.
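To make the "not a one-to-one translation" point concrete, here is a minimal sketch of one small operation written for three backends. It is not taken from bwa: `sat_add_max16` is a hypothetical helper, chosen only because saturating byte adds and lane-wise maxima are typical of SIMD Smith-Waterman kernels. The saturating add maps directly between SSE2 and NEON, while the horizontal max takes a different shape on each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics (x86 only) */
#elif defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>    /* NEON intrinsics (AArch64) */
#endif

/* Hypothetical helper, not part of bwa: saturating add of two 16-byte
 * vectors, followed by the maximum over the 16 resulting lanes. */
static uint8_t sat_add_max16(const uint8_t *a, const uint8_t *b)
{
    assert(a != NULL && b != NULL);
#if defined(__SSE2__)
    __m128i v = _mm_adds_epu8(_mm_loadu_si128((const __m128i *)a),
                              _mm_loadu_si128((const __m128i *)b));
    /* SSE2 has no horizontal max, so fold the vector onto itself. */
    v = _mm_max_epu8(v, _mm_srli_si128(v, 8));
    v = _mm_max_epu8(v, _mm_srli_si128(v, 4));
    v = _mm_max_epu8(v, _mm_srli_si128(v, 2));
    v = _mm_max_epu8(v, _mm_srli_si128(v, 1));
    return (uint8_t)(_mm_cvtsi128_si32(v) & 0xff);
#elif defined(__aarch64__) && defined(__ARM_NEON)
    /* The saturating add maps directly; the horizontal max is a single
     * AArch64 instruction. */
    uint8x16_t v = vqaddq_u8(vld1q_u8(a), vld1q_u8(b));
    return vmaxvq_u8(v);
#else
    /* Portable scalar fallback for every other architecture. */
    uint8_t best = 0;
    for (int i = 0; i < 16; i++) {
        unsigned s = (unsigned)a[i] + (unsigned)b[i];
        if (s > 255) s = 255;
        if (s > best) best = (uint8_t)s;
    }
    return best;
#endif
}
```

Calling `sat_add_max16(a, b)` on two 16-byte arrays returns the same value on all three paths. An emulation layer would typically reproduce the SSE2 folding sequence step by step rather than pick the single NEON instruction, which is one way a mechanical port can lose performance.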

The ARM ecosystem is tiny right now. I will see how it evolves, in particular with the availability of ARM Macs, but for now there is no plan for ARM support.

lh3 avatar Oct 05 '20 18:10 lh3

Good news is that binaries compiled for Intel Macs run on M1 Macs, confirmed as I recently upgraded my 2018 Intel MacBook Pro to a new M1 MacBook Pro.

Emulated bwa on my M1 laptop is about 25% faster than native bwa on the Intel laptop which was a pleasant surprise.

barkoneus avatar Dec 14 '20 23:12 barkoneus

Hi, can you please tell me how I can achieve the same? I have an Intel-based MacBook while my sister has an M1. How can I copy bwa from my machine to her system?

khanshahan avatar Jan 04 '21 22:01 khanshahan

Yes, all you need to do is copy over the bwa executable file from the Intel machine to the M1 machine. Nothing else needs to be done.

barkoneus avatar Jan 04 '21 23:01 barkoneus

Thank you very much @barkoneus, this worked :)

khanshahan avatar Jan 05 '21 15:01 khanshahan

This can help with compiling for ARM: https://aws.amazon.com/blogs/publicsector/generalized-approach-benchmarking-genomics-workloads-cloud-bwa-read-aligner-graviton2/

markotitel avatar Aug 02 '21 11:08 markotitel

Sounds easy, but HOW do I do it? (1) Where is the executable? (2) Where (which directory) do I copy it to when moving it from the Intel machine to the new M1?

:(

edubielalpizar avatar Dec 19 '21 19:12 edubielalpizar

I'd love to see this implemented. There are two separate PRs open right now to fix this: https://github.com/lh3/bwa/pull/283 and https://github.com/lh3/bwa/pull/344.

SIMDe is probably a more robust solution long-term, but the sse2neon.h solution is a smaller changeset.

Right now, there are multiple outside groups patching bwa to run on ARM CPUs, from distro maintainers like Debian to AWS telling their customers how to build bwa on ARM. It'd be great to get first-party ARM support.

pettyalex avatar Mar 14 '22 20:03 pettyalex

It would be awesome if support for ARM64 were added!

martin-g avatar Apr 20 '22 07:04 martin-g

BTW, the issue seems to be the same for the PPC64, S390X and RISC-V architectures.

V-Z avatar May 13 '22 11:05 V-Z

How can we help get ARM64 support added?

julien-faye avatar May 13 '22 12:05 julien-faye

See also PR #359.

jmarshall avatar Jun 26 '22 23:06 jmarshall