villekf

Results: 12 comments by villekf

Here are some tuning results from Intel Xeon E5-2630 v3 and v4, as well as Nvidia Tesla P100 PCI-E 16 GB. [CLBlast_tuners.zip](https://github.com/CNugteren/CLBlast/files/2262652/CLBlast_tuners.zip)

With `export AF_TRACE=all`, the output is `[platform][1589810797][014006] [ ../src/backend/common/DependencyModule.cpp:55 ] Attempting to load: libforge.so`, after which the crash occurs. `libforge.so` is found only in the lib64 folder of the AF installation.

If the Forge files are in the library folder, that is all the output there is (MATLAB segfaults). If I remove them, the output becomes this on OpenCL (and everything...

I'm using OpenCL, but the same thing happens for both CPU and CUDA as well. Running the conway_opencl example produces the following when using the installer libraries: ``` [platform][1589814692][008749] [...

I believe, for example, ``` if (temp != '.') timeStart_str=timeStart_str+"0"; ``` should be ``` if (temp != '.') timeStart_str=timeStart_str+"."; ``` instead.

It seems that this slowdown happens when the sparse matrix components are computed in a separate function. The following code reproduces this on my end consistently (yeah, using random numbers...

Is there any update on this issue? Or is there some workaround that doesn't involve lower dimensions?

The same issue is also present when using `inverse` instead, e.g. `KG = matmul(inverse(HH), KG);`

You can give normally distributed values a mean µ and standard deviation σ by shifting and scaling standard normal samples: `x = µ + randn(N) * σ`.
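The shift-and-scale idea can be sketched in plain C++ using only the standard library. This is a minimal illustration, not the library's own `randn`: the helper names `scaled_randn`, `sample_mean`, and `sample_stddev`, and the fixed seed, are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draw n samples from N(mu, sigma^2) by shifting and
// scaling standard normal draws, i.e. x = mu + z * sigma with z ~ N(0, 1).
std::vector<double> scaled_randn(std::size_t n, double mu, double sigma,
                                 unsigned seed) {
    assert(sigma >= 0.0);  // a standard deviation cannot be negative
    std::mt19937 gen(seed);
    std::normal_distribution<double> standard_normal(0.0, 1.0);
    std::vector<double> x(n);
    for (auto& value : x) {
        value = mu + standard_normal(gen) * sigma;
    }
    return x;
}

// Arithmetic mean of the samples (expects a non-empty vector).
double sample_mean(const std::vector<double>& x) {
    double sum = 0.0;
    for (double value : x) sum += value;
    return sum / static_cast<double>(x.size());
}

// Unbiased sample standard deviation (expects at least two samples).
double sample_stddev(const std::vector<double>& x) {
    const double mean = sample_mean(x);
    double squared_deviations = 0.0;
    for (double value : x) {
        squared_deviations += (value - mean) * (value - mean);
    }
    return std::sqrt(squared_deviations / static_cast<double>(x.size() - 1));
}
```

Calling `scaled_randn(200000, 3.0, 2.0, 42u)` should give samples whose mean and standard deviation are close to 3 and 2; the exact values depend on the standard library implementation.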