nSircombe
There are a number of issues with the current AArch64 whl build in [build_aarch64_wheel.py](https://github.com/pytorch/builder/blob/master/build_aarch64_wheel.py) which appear to be impacting the performance of the finished whl. 1. OpenBLAS has not been...
Currently, AArch64 builds rely on a single-threaded build of OpenBLAS; see https://github.com/pytorch/builder/blob/8e799eb4708069db379dba20b1f324040f5e991e/build_aarch64_wheel.py#L182. Inclusion of OpenMP is marked as a `TODO`. Enabling support should be a matter of adding the `USE_OPENMP=1`...
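As a rough illustration of the change described above, the sketch below assembles an OpenBLAS `make` invocation with the `USE_OPENMP=1` option switched on. The helper name `openblas_make_command` and its `jobs` parameter are hypothetical and are not taken from `build_aarch64_wheel.py`; the real script may pass other options and run the command differently.

```python
import shlex


def openblas_make_command(use_openmp: bool = True, jobs: int = 8) -> str:
    """Return a hypothetical `make` command line for building OpenBLAS.

    This helper is illustrative only; it is not part of the builder script.
    """
    args = ["make", f"-j{jobs}"]
    if use_openmp:
        # USE_OPENMP=1 is the OpenBLAS make option named in the issue text.
        args.append("USE_OPENMP=1")
    # Quote each argument so the string is safe to hand to a shell.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(openblas_make_command())
    print(openblas_make_command(use_openmp=False, jobs=4))
```

Keeping the option behind a flag makes it straightforward to build both variants and compare the performance of the resulting whl.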
The transformational Fortran 2008 intrinsic NORM2 was added in #786. Recent testing has shown unexpected overflows for large double-precision arguments, specifically when the square of an element in the argument...
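To make the failure mode concrete, here is a minimal Python sketch, assuming the overflow arises from squaring elements directly before summing. It contrasts that naive 2-norm with a version that scales by the largest magnitude first. The function names are hypothetical, and the scaling approach is one common remedy, not necessarily the fix adopted for NORM2.

```python
import math


def naive_norm2(values):
    """2-norm computed by summing raw squares; overflows for large inputs."""
    total = 0.0
    for v in values:
        total += v * v  # v * v becomes inf once it exceeds the double range
    return math.sqrt(total)


def scaled_norm2(values):
    """2-norm computed after dividing every element by the largest magnitude."""
    scale = max((abs(v) for v in values), default=0.0)
    if scale == 0.0 or math.isinf(scale):
        return scale
    total = 0.0
    for v in values:
        ratio = v / scale  # |ratio| <= 1, so ratio * ratio cannot overflow
        total += ratio * ratio
    return scale * math.sqrt(total)


if __name__ == "__main__":
    big = [1e200, 1e200]
    print(naive_norm2(big))   # overflows to inf
    print(scaled_norm2(big))  # finite, close to sqrt(2) * 1e200
```

The true norm of the example input is representable as a double even though each element's square is not, which is why the naive result is surprising.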