Andrew
Visual Studio includes both the old Microsoft OpenMP and LLVM OpenMP. If you are using clang-cl.exe, you need to select -openmp=llvm or ms.
Use plain clang.exe; clang-cl is just a partial cl.exe replica. Or see: https://devblogs.microsoft.com/cppblog/improved-openmp-support-for-cpp-in-visual-studio/
It links to the older OpenMP provided by Microsoft, despite the one CMake detected. Use clang.exe for CC; that is fairly easy.
.text is 10 MB, which is reasonable for a single CPU type. Frankly, I have no idea how Microsoft OpenMP gets in the way; it is not in Visual Studio by default.
Please present some measurements, e.g. integrate various OpenBLAS builds into Octave or R and run the same benchmark scripts over and over. You needed OpenMP, which means you can call OpenBLAS...
Call OpenBLAS from the top level, not from within extra OpenMP pragmas? That should be obvious if you program with OpenMP.
It is one bit of precision off, a very normal occurrence when computing in a different order.
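A minimal sketch of that effect, assuming IEEE 754 double precision (ordinary Python floats). The three values are illustrative only and are not taken from the computation discussed here.

```python
# Same three numbers, summed in two different orders (IEEE 754 doubles).
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two sums disagree only in the last bit of the mantissa.
difference = abs(left_to_right - right_to_left)

print(left_to_right == right_to_left)  # False
print(difference <= 2.0 ** -52)        # True: the gap is at most one machine epsilon
```

A threaded or vectorized BLAS kernel reorders sums in the same way, so a last-bit difference against a reference result is expected rather than a bug.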
You abuse machine rounding precision 32 times (or 16 with FMA), so discount 5 bits in your check. It is not a magic symbolic computation soup that gives an accurate poly result...
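One way to read "discount 5 bits": 32 rounding steps can cost about log2(32) = 5 bits, so the check allows 2^5 machine epsilons instead of exact equality. The sketch below assumes a degree-16 polynomial evaluated with Horner's rule (16 multiplies plus 16 adds); the coefficients, the point `x`, and the helper name `horner` are hypothetical and not from the original code.

```python
from fractions import Fraction

def horner(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs are highest degree first."""
    acc = coeffs[0]
    for c in coeffs[1:]:
        acc = acc * x + c  # one multiply and one add, each rounded when using floats
    return acc

# Hypothetical degree-16 polynomial: 16 multiplies + 16 adds = 32 roundings.
coeffs = [1.0 / (k + 1) for k in range(17)]
x = 0.75

approx = horner(coeffs, x)                                   # double precision
exact = horner([Fraction(c) for c in coeffs], Fraction(x))   # exact rational reference

rel_err = abs(Fraction(approx) - exact) / exact

# Discount 5 bits: accept up to 2**5 machine epsilons of relative error.
tolerance = Fraction(2) ** 5 * Fraction(2) ** -52

print(rel_err <= tolerance)  # True
```

All coefficients and `x` are positive here, so there is no cancellation and the accumulated rounding error stays within that budget. With mixed signs the tolerance would need to be scaled by the sum of absolute term values instead.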
The result is rounded to output precision to be stored in a register every 1 or 2 FLOPs, 50% up and 50% down, and so the lottery continues until the end of the computation....
Just a guess: Intel uses generic code for small inputs, then gradually switches to vector code and adds CPU threads as samples grow. OpenBLAS always uses vector code and...