Devin Matthews

Results 24 issues of Devin Matthews

The mechanism(s) used for backoff must be defined by the configuration or configuration family. Up to three successive mechanisms are supported (e.g. pause/sched_yield/sleep). Example (in `config//bli_family_.h`): ```C++ #define BLIS_BARRIER_BACKOFF_1 30...
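The idea of successive backoff mechanisms (pause, then `sched_yield`, then sleep) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the thresholds, the stage names, and the helpers `backoff_stage_for`, `backoff_once`, and `wait_for_flag` are hypothetical and are not the BLIS macros or API. It also assumes a POSIX system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sched.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stages, tried in order as the wait gets longer. */
typedef enum { STAGE_PAUSE, STAGE_YIELD, STAGE_SLEEP } backoff_stage;

/* Hypothetical thresholds (spin counts); a real configuration would tune these. */
enum { YIELD_AFTER = 100, SLEEP_AFTER = 1000 };

/* Pick the backoff stage for the given number of spins so far. */
static backoff_stage backoff_stage_for(unsigned spins)
{
    if (spins < YIELD_AFTER) return STAGE_PAUSE;
    if (spins < SLEEP_AFTER) return STAGE_YIELD;
    return STAGE_SLEEP;
}

/* Emit a CPU pause hint on x86 with GCC/clang; a no-op elsewhere. */
static void cpu_pause(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
    __builtin_ia32_pause();
#endif
}

/* Perform one backoff step appropriate to how long we have been waiting. */
static void backoff_once(unsigned spins)
{
    switch (backoff_stage_for(spins)) {
    case STAGE_PAUSE:
        cpu_pause();
        break;
    case STAGE_YIELD:
        sched_yield();
        break;
    case STAGE_SLEEP: {
        struct timespec ts = { 0, 1000 }; /* 1 microsecond */
        nanosleep(&ts, NULL);
        break;
    }
    }
}

/* Spin until *flag becomes nonzero; returns the number of backoff steps taken. */
static unsigned wait_for_flag(atomic_int *flag)
{
    unsigned spins = 0;
    while (atomic_load_explicit(flag, memory_order_acquire) == 0)
        backoff_once(spins++);
    return spins;
}
```

A barrier would call something like `wait_for_flag` on its release flag. The escalation keeps short waits cheap while avoiding burning a core on long ones.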

The autoconf macro [ax_compiler_vendor](http://git.savannah.gnu.org/gitweb/?p=autoconf-archive.git;a=blob_plain;f=m4/ax_compiler_vendor.m4) provides an arguably better way to determine the compiler vendor, since it doesn't depend on the vagaries of compiler version strings (although it does require running...

There are some important facets of the various kernels which are not currently tested:
- [ ] The case of `beta == 0` in gemm ukrs. We should pre-populate C...

enhancement
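For the untested `beta == 0` case mentioned above, one possible check is sketched here. It relies on the usual BLAS convention that when `beta` is zero, C is overwritten and NaN or Inf values already in C must not propagate. Pre-filling C with NaN is an assumption about how such a test might work. The kernel signature and the names `gemm_overwrite`, `gemm_scale`, and `beta_zero_ignores_c` are hypothetical and are not BLIS microkernel interfaces.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kernel signature: C := beta*C + alpha*A*B, all row-major. */
typedef void (*gemm_fn)(size_t m, size_t n, size_t k, double alpha,
                        const double *a, const double *b,
                        double beta, double *c);

/* Reference kernel that treats beta == 0 as "overwrite C". */
static void gemm_overwrite(size_t m, size_t n, size_t k, double alpha,
                           const double *a, const double *b,
                           double beta, double *c)
{
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++) {
            double ab = 0.0;
            for (size_t p = 0; p < k; p++)
                ab += a[i * k + p] * b[p * n + j];
            c[i * n + j] = (beta == 0.0) ? alpha * ab
                                         : beta * c[i * n + j] + alpha * ab;
        }
}

/* Deliberately faulty kernel: always multiplies C by beta, so NaN leaks through. */
static void gemm_scale(size_t m, size_t n, size_t k, double alpha,
                       const double *a, const double *b,
                       double beta, double *c)
{
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++) {
            double ab = 0.0;
            for (size_t p = 0; p < k; p++)
                ab += a[i * k + p] * b[p * n + j];
            c[i * n + j] = beta * c[i * n + j] + alpha * ab;
        }
}

/* Fill C with NaN, run the kernel with beta == 0, and check the exact result. */
static bool beta_zero_ignores_c(gemm_fn kernel)
{
    enum { M = 4, N = 4, K = 3 };
    double a[M * K], b[K * N], c[M * N];
    for (size_t i = 0; i < M * K; i++) a[i] = 1.0;
    for (size_t i = 0; i < K * N; i++) b[i] = 2.0;
    for (size_t i = 0; i < M * N; i++) c[i] = NAN;

    kernel(M, N, K, 1.0, a, b, 0.0, c);

    /* Each entry should be K * 1.0 * 2.0 = 6.0; a NaN compares unequal. */
    for (size_t i = 0; i < M * N; i++)
        if (c[i] != 6.0) return false;
    return true;
}
```

The check passes for `gemm_overwrite` and fails for `gemm_scale`, since `0.0 * NaN` is NaN under IEEE 754 arithmetic.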

The perfect storm happens:
1. On macOS
2. With gcc (in my case 11.2)
3. With debugging enabled (`CFLAGS="-g -O0"`)
4. Using the `haswell` configuration

What occurs in my case...

bug
wontfix

Many applications (e.g. in machine learning) end up with matrix products that have small(ish) `m` and `n` dimensions, but large `k`. For these cases, one would need to parallelize the...

enhancement

- Removed the info and info2 fields and replaced with bitfield entries for each distinct flag/property.
- Simplified and rearranged the bitwise definitions of the various enumerations.
- Removed many...

See e.g. [here](https://ci.appveyor.com/project/shpc/blis/build/job/nxn5du4s0v5hm7t0). The build passes even though many BLAS tests fail. In this case, I have no idea *why* they fail but since they do so only for shared...

enhancement

Changes intended to fix whatever problems may exist for cross-compiling on Windows from Linux (incl. WSL) and macOS.
- [X] macOS (w/ gcc 11)
- [ ] Linux (gcc version...

Reposted from #463, as reported by @isuruf:

There are more errors with that cross compiling build though.
1. Needs `-fno-asynchronous-unwind-tables` (we should check `__MINGW64__` macro and add that flag). gcc...

enhancement

The Zen-optimized amaxv kernel uses `__m128` (float) vectors for indices, which could overflow or lose precision for large arrays.

enhancement
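The precision concern with float-typed indices can be shown with a scalar sketch. A 32-bit float has a 24-bit significand, so integers above 2^24 = 16777216 are not all exactly representable. The helper names below are hypothetical, and this is not the kernel's code. It only illustrates the arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if `index` is unchanged after a round trip through a 32-bit float. */
static bool index_survives_float(int64_t index)
{
    volatile float as_float = (float)index; /* volatile: force a true 32-bit store */
    return (int64_t)as_float == index;
}

/* Returns the smallest non-negative index below `limit` that a float cannot
   hold exactly, or -1 if every index below `limit` survives. */
static int64_t first_lossy_index(int64_t limit)
{
    for (int64_t i = 0; i < limit; i++)
        if (!index_survives_float(i)) return i;
    return -1;
}
```

Under these assumptions, `first_lossy_index` returns 16777217 for any limit above that value, so an array with more than about 16.7 million elements could report a wrong position if the index is carried as a float.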