xqch1983

Results 5 issues of xqch1983

Is there any work planned to improve the performance of radix_sort_by_key()? In my tests it takes 11 ms per 1M elements, versus ~1.15 ms in rocmPRIM(OpenCL) and...
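For readers unfamiliar with the operation being timed, here is a minimal CPU sketch of a sort-by-key using a least-significant-digit radix sort. It is an illustration only: the function name `radix_sort_by_key_ref`, the 32-bit key and value types, and the 8-bit digit width are assumptions, and it does not reflect how the GPU library implements radix_sort_by_key().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical CPU reference (not the library's GPU code):
 * stable LSD radix sort of 32-bit keys, 8 bits per pass,
 * moving each value together with its key.
 * Returns 0 on success, -1 if scratch memory cannot be allocated. */
int radix_sort_by_key_ref(uint32_t *keys, uint32_t *vals, size_t n)
{
    if (n == 0)
        return 0;

    uint32_t *tk = malloc(n * sizeof *tk);
    uint32_t *tv = malloc(n * sizeof *tv);
    if (!tk || !tv) {
        free(tk);
        free(tv);
        return -1;
    }

    for (unsigned shift = 0; shift < 32; shift += 8) {
        size_t count[257] = {0};

        /* Histogram of the current 8-bit digit, offset by one slot. */
        for (size_t i = 0; i < n; ++i)
            count[((keys[i] >> shift) & 0xFF) + 1]++;

        /* Prefix sum: count[d] becomes the first output slot for digit d. */
        for (int d = 0; d < 256; ++d)
            count[d + 1] += count[d];

        /* Stable scatter of keys and their values. */
        for (size_t i = 0; i < n; ++i) {
            size_t pos = count[(keys[i] >> shift) & 0xFF]++;
            tk[pos] = keys[i];
            tv[pos] = vals[i];
        }

        memcpy(keys, tk, n * sizeof *keys);
        memcpy(vals, tv, n * sizeof *vals);
    }

    free(tk);
    free(tv);
    return 0;
}
```

A reference like this is mainly useful for checking the GPU result on small inputs before comparing timings.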

performance

We would like to know how to set environment parameters so that GPU_MAX_WORKGROUP_SIZE is larger than 256 and actually takes effect in an OpenCL kernel; 256 is currently the default maximum. We have...

I ran configure with the parameters below: ./configure --prefix=/tmpdata2/DevGroup/xieqingchun/tools/blis_lib -t openmp --enable-cblas CC=gcc CFLAGS="-fomit-frame-pointer" haswell How can I avoid the build error with a correct configuration? Thanks. Then I typed "make" in the terminal: the...

cblas_zomatadd should be supported with the same interface as in MKL. It implements the addition of two matrices as below: C := alpha*op(A) + beta*op(B) The API in MKL is [mkl_?omatadd](https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2023-1/mkl-omatadd.html),...
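The operation C := alpha*op(A) + beta*op(B) can be sketched in plain C as follows. This is a rough reference under stated assumptions, not the MKL or OpenBLAS implementation: the name `ref_zomatadd` is hypothetical, storage is assumed row-major, and the argument order only loosely follows the MKL routine. The op characters are taken to mean 'N' (as is), 'T' (transpose), 'C' (conjugate transpose), and 'R' (conjugate only).

```c
#include <complex.h>
#include <stddef.h>

/* Element (i, j) of op(X) for a row-major matrix X with leading dimension ld.
 * Hypothetical helper, introduced here for illustration only. */
static double complex op_elem(const double complex *x, size_t ld,
                              char trans, size_t i, size_t j)
{
    switch (trans) {
    case 'T': return x[j * ld + i];        /* transpose */
    case 'C': return conj(x[j * ld + i]);  /* conjugate transpose */
    case 'R': return conj(x[i * ld + j]);  /* conjugate, no transpose */
    default:  return x[i * ld + j];        /* 'N': as stored */
    }
}

/* Hypothetical reference routine (not an MKL or OpenBLAS symbol):
 * C := alpha*op(A) + beta*op(B), where C is m x n and row-major. */
void ref_zomatadd(char transa, char transb, size_t m, size_t n,
                  double complex alpha, const double complex *a, size_t lda,
                  double complex beta, const double complex *b, size_t ldb,
                  double complex *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            c[i * ldc + j] = alpha * op_elem(a, lda, transa, i, j)
                           + beta * op_elem(b, ldb, transb, i, j);
        }
    }
}
```

Note that this sketch writes C out of place; it makes no attempt to handle the case where C overlaps A or B.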

Rename branch from xqch to ocl_openTLD