Pradeep Garigipati

Results 267 comments of Pradeep Garigipati

@UmashankarTriforce Sorry for the late response. @UmashankarTriforce @rahul-ar Yes, the CUDA/OpenCL join can still use this performance improvement.

@rahul-ar We have our [Slack channel][1] for quick questions, but it would be better to iron out the approach/design here before going ahead with the implementation. Given that you are...

@willyborn Thank you, will have a look soon and let you know.

@willyborn Sorry about the delay on my end; I just returned to work after a long break. I see that you have already raised PRs #3144 and #3145. Do they...

@anates Sorry about the delay. Which BLAS library did you use to build ArrayFire's CPU backend?

In case you are using Intel MKL, please go through the conversation in this issue: https://github.com/arrayfire/arrayfire/issues/3042. It might help to check if some MKL variable can be set to resolve...

Can you try setting [`MKL_NUM_THREADS`](https://software.intel.com/content/www/us/en/develop/documentation/mkl-linux-developer-guide/top/managing-performance-and-memory/improving-performance-with-threading/using-additional-threading-control/intel-mkl-specific-environment-variables-for-openmp-threading-control.html) and check if that changes the behavior?
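
For anyone trying this, here is a minimal sketch of one way to set `MKL_NUM_THREADS` for a single run without touching the global shell environment. It assumes the variable is set before the process starts, which is the simplest way to be sure MKL sees it. The helper names `env_with_mkl_threads` and `run_with_mkl_threads` are hypothetical, and the child command below is only a stand-in that echoes the variable; swap in your own program built against the ArrayFire CPU backend.

```python
import os
import subprocess
import sys


def env_with_mkl_threads(num_threads):
    """Return a copy of the current environment with MKL_NUM_THREADS set."""
    env = dict(os.environ)
    env["MKL_NUM_THREADS"] = str(num_threads)
    return env


def run_with_mkl_threads(command, num_threads):
    """Run `command` (a list of arguments) with MKL_NUM_THREADS set for the child only."""
    return subprocess.run(
        command,
        env=env_with_mkl_threads(num_threads),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in child process: it only echoes the variable it received.
    # Replace this command with the real ArrayFire CPU-backend program.
    child = [sys.executable, "-c", "import os; print(os.environ['MKL_NUM_THREADS'])"]
    result = run_with_mkl_threads(child, 4)
    print(result.stdout.strip())
```

Running the same program with a few different thread counts and comparing CPU utilization should show whether the setting has any effect.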

> Can you try setting [`MKL_NUM_THREADS`](https://software.intel.com/content/www/us/en/develop/documentation/mkl-linux-developer-guide/top/managing-performance-and-memory/improving-performance-with-threading/using-additional-threading-control/intel-mkl-specific-environment-variables-for-openmp-threading-control.html) and check if that changes the behavior?

@anates Any change? Or is it still using a single thread?

@anates If you have resolved the issue by changing your development setup, please share the details here for any future users who may face a similar issue.

@willyborn I can't recollect why I avoided hashing the JIT code, perhaps to avoid extra computation time. A fixed-version solution might not work in a couple of corner cases such as...