OpenBLAS

OpenBLAS is an optimized BLAS library based on the BSD-licensed GotoBLAS2 1.13.

Results 282 OpenBLAS issues

All I know is that this builds and works fine with clangarm64 on my laptop. Unsure about performance improvement, but certainly no performance regression. I am not an assembly wizard,...

How to optimize when using zgemm(T,N,m,n,k), m=n100000, lda=ldb=ldc=k

The check for GCC is confused by the GNU-stack in ``` .file "FIRModule" .text .globl zhoge_ .p2align 4 .type zhoge_,@function zhoge_: xorps %xmm0, %xmm0 xorps %xmm1, %xmm1 retq .Lfunc_end0: .size...

I had a brief bit of confusion because the docs suggested that `OMP_NUM_THREADS` would only affect OpenMP builds, when actually it's used as a fallback in non-OpenMP builds as well....
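The fallback described above can be pictured with a small model. This is only a sketch in Python, not OpenBLAS code: the helper name `resolve_blas_threads` is hypothetical, the lookup order follows the priority stated in the OpenBLAS README (`OPENBLAS_NUM_THREADS`, then `GOTO_NUM_THREADS`, then `OMP_NUM_THREADS`), and skipping non-numeric or non-positive values is an assumption of this sketch.

```python
import os

# Hypothetical helper: models the documented lookup order only.
# It does not call into OpenBLAS and is not taken from its source.
def resolve_blas_threads(env=None, default=None):
    """Return (variable_name, thread_count) for the first usable setting."""
    env = os.environ if env is None else env
    # Priority as stated in the OpenBLAS README.
    for name in ("OPENBLAS_NUM_THREADS", "GOTO_NUM_THREADS", "OMP_NUM_THREADS"):
        raw = env.get(name, "").strip()
        # Assumption of this sketch: ignore anything that is not a positive integer.
        if raw.isdigit() and int(raw) > 0:
            return name, int(raw)
    return None, default


# With only OMP_NUM_THREADS set, the model falls back to it.
print(resolve_blas_threads({"OMP_NUM_THREADS": "4"}))
```

Passing a plain dictionary instead of reading `os.environ` keeps the model easy to check without changing the real environment.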

Very large numbers of calls to the symmetric complex eigenproblem via numpy.linalg.eigh() can have dramatic slowdowns due to multi-threading, especially when the number of cores is large and when there...
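One common way to experiment with this kind of slowdown is to cap the BLAS thread count for the whole process before NumPy is imported. The sketch below assumes a NumPy build backed by OpenBLAS. The helper name `pin_blas_threads` is hypothetical, and whether a single thread actually helps the workload in this report is an untested assumption.

```python
import os

# Hypothetical helper: sets the thread-count variables for this process.
# It must run before numpy (and therefore the BLAS library) is first imported,
# because the variables are read when the library is loaded.
def pin_blas_threads(n=1):
    value = str(int(n))
    for name in ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS"):
        os.environ[name] = value
    return value


pin_blas_threads(1)
# import numpy as np   # import only after the variables are set
# np.linalg.eigh(...)  # many small calls would then run single-threaded
```

Setting both variables covers builds that read either one. A per-call limit would need a different mechanism than environment variables.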

This PR adds thread thresholding for Power10 by introducing the `get_gemv_optimal_nthreads_power10` function.

Environment: operating system: Windows 10. IDE: Visual Studio 2022 Community. OpenCV version: opencv-4.12.0. I used the CMake GUI to build opencv-4.12.0 with the binary openblas0.30.0, but CMake does not support this version, so I build...

Building commit d23680b81d5179ce6ae1ca5546303b81646ecac1 with `make -j DYNAMIC_ARCH=1 TARGET=VORTEX` results in test failures on Apple M4: ``` TEST 1135/1522 sgemmt:c_api_rowmajor_upper_M_50_K_50_a_notrans_b_notrans [FAIL] ERR: test_extensions/test_sgemmt.c:797 expected 0.000e+00, got 2.741e-01 (diff -2.741e-01, tol 1.000e-04)...

Forwarded bug report from Julia: https://github.com/JuliaLang/LinearAlgebra.jl/issues/1463 The error we get in SNRM2 is much higher than it should be when computed on Apple ARM64. We tried the same thing on AMD...