Peter Simon
This is a continuation of the benchmarking I presented in #159. There, I showed the results of running the `perf/lu.jl` script of the `RecursiveFactorization` package on a Linux desktop machine...
Out of curiosity, I commented out this line in the script:

```julia
#BLAS.set_num_threads(nc)
```

and restarted Julia with `-t 8` using OpenBLAS. The result is: That didn't...
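For anyone who wants to reproduce a single data point without running the full script, a minimal sketch along these lines should show the threading effect (the size `n = 500` and the direct use of `lu!` are my choices for illustration, not the script's exact code):

```julia
using LinearAlgebra, BenchmarkTools
import RecursiveFactorization

# Illustrative size; perf/lu.jl sweeps a range of sizes
n = 500
A = rand(n, n)

# OpenBLAS LU; its thread count depends on whether BLAS.set_num_threads was called
@btime lu!(B) setup = (B = copy($A)) evals = 1

# RecursiveFactorization's LU, which runs on Julia's own task threads (-t 8)
@btime RecursiveFactorization.lu!(B) setup = (B = copy($A)) evals = 1
```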
Why does OpenBLAS perform so much worse on Windows than on Linux?
8 threads on my system:

```julia
julia> versioninfo()
Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
  WORD_SIZE: 64
  ...
```
I'm happy to wait for the rewrite. However, I'm also not seeing the advertised speedup on Float64:

| Algorithm | n = 200 | n = 500 | n = ...
So the default algorithm was selected to be `RFLUFactorization` for the n = 200 and n = 500 cases, and it shouldn't be expected to be competitive for the n = 2000...
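For the larger sizes, it may be worth forcing the algorithm explicitly rather than relying on the default heuristic. A minimal sketch, assuming the standard LinearSolve interface (the size here is illustrative):

```julia
using LinearSolve

n = 2000
A = rand(n, n); b = rand(n)
prob = LinearProblem(A, b)

# Force the recursive LU (the default pick at smaller sizes)
sol_rf = solve(prob, RFLUFactorization())

# Force the BLAS/LAPACK LU (expected to win at larger sizes)
sol_lu = solve(prob, LUFactorization())
```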
```julia
julia> versioninfo(verbose=true)
Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      "Manjaro Linux"
  uname: Linux 5.10.126-1-MANJARO #1 SMP PREEMPT Mon Jun 27 10:02:42 UTC 2022
  ...
```
Thanks for your fast response on this issue. Looking forward to using this for complex matrices (ubiquitous in my work) in the future.
Julia was started with 8 threads. BLAS threading was set by the script:

```julia
nc = min(Int(VectorizationBase.num_cores()), Threads.nthreads())
BLAS.set_num_threads(nc)
```

which looks like it would be 8 as well.
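As a sanity check, both thread counts can be queried directly (assuming Julia ≥ 1.6, where `BLAS.get_num_threads` is available):

```julia
using LinearAlgebra, VectorizationBase

Threads.nthreads()                  # Julia task threads from the -t flag, here 8
Int(VectorizationBase.num_cores())  # physical core count seen by VectorizationBase
BLAS.get_num_threads()              # thread count the BLAS library reports
```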
For completeness, here is the result of benchmarking after `using MKL`:
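For reference, `using MKL` swaps the default BLAS backend through libblastrampoline; on Julia ≥ 1.7 the active library can be confirmed with `BLAS.get_config()`:

```julia
using MKL, LinearAlgebra

# Should list libmkl_rt rather than libopenblas among the loaded LBT libraries
BLAS.get_config()
```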