Andreas Klöckner

320 issues by Andreas Klöckner

Watching some CL program (gory details [here](https://notes.tiker.net/s/3D3-VhuSf)) execute on pocl's CUDA backend via `nvprof` (Nvidia's profiling tool) gives me

```
API calls:  52.24%  1.20894s  72145  16.757us  7.6450us  22.126ms  cuLaunchKernel
            19.36%...
```

contributions welcome
CUDA

Flipping through the pthreads backend, it didn't seem as though core affinity for the created threads was getting enabled anywhere. If that's indeed the case, that might be an easy...

enhancement
contributions welcome
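For readers unfamiliar with per-thread core affinity, here is a minimal Python sketch of the idea raised in the entry above. It is not pocl code: pocl's pthreads backend is written in C and would go through the pthreads API instead. The helper names `pin_current_thread` and `run_pinned_workers` are hypothetical, and the sketch assumes Linux, where `os.sched_setaffinity` with pid `0` applies to the calling thread.

```python
import os
import threading


def pin_current_thread(cpu):
    """Restrict the calling thread to one CPU and return its new affinity set.

    Returns None if the platform lacks os.sched_setaffinity or refuses the call.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    try:
        # Assumption: on Linux, pid 0 addresses the calling thread only.
        os.sched_setaffinity(0, {cpu})
        return set(os.sched_getaffinity(0))
    except OSError:
        return None


def run_pinned_workers():
    """Start one worker per currently allowed CPU and pin each to that CPU."""
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))
    else:
        cpus = []
    observed = {}

    def worker(cpu):
        # Each worker pins itself, then records the affinity it ended up with.
        observed[cpu] = pin_current_thread(cpu)

    threads = [threading.Thread(target=worker, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed


if __name__ == "__main__":
    for cpu, affinity in sorted(run_pinned_workers().items()):
        print(f"worker for CPU {cpu}: affinity {affinity}")
```

Pinning one worker per core keeps the scheduler from migrating threads between cores, which is the usual motivation for enabling affinity in a CPU compute backend.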

I've come across [another example](https://gist.github.com/inducer/8d1ad78c548bf053a85eabc015804484) of a kernel that appears to be miscompiled, resulting in a segfault or complaints about heap corruption. When I run that script on top of...

This kernel here: https://gist.github.com/inducer/8f7cd72829c85acc1d3fcb9c4a5dae05 gets vectorized into full-width 256-bit vectors without a problem by both Intel CL and ispc. (Note how the workgroup size already conveniently matches the expected vector...

Kernel compiler issue

Running

```
flake8==3.9.2
flake8-bugbear==21.4.3
flake8-polyfill==1.0.2
flake8-quotes==3.2.0
```

on

```
def main():
    def f(x):
        print(i, x)

    for i in [1, 2, 3]:
        f('HI')
```

gives me

```
bugbearbug.py:5:9: B007 Loop control...
```

enhancement
help wanted

For PyOpenCL's [2022.1.6 release](https://github.com/inducer/pyopencl/releases/tag/v2022.1.6), [`CITATION.cff`](https://github.com/inducer/pyopencl/blob/6b68b79e9400e802755735f366eda221973cb606/CITATION.cff) contained the name of an individual (@gw0) for whom only an alias, but no first and family names are known. I represented this in `CITATION.cff`...

Bug

This was a dead-end while debugging #77. No need for it currently, but who knows? Maybe we'll need it sometime.

The analogous issue to https://github.com/inducer/pytato/issues/163 also exists in pymbolic. cc @alexfikl

FP arithmetic isn't associative, after all.
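A small illustration of that remark, as a sketch only: the values below are chosen for demonstration and are not taken from the issue. Regrouping the same three IEEE 754 double-precision operands changes the result.

```python
def grouped_left(a, b, c):
    """Evaluate (a + b) + c."""
    return (a + b) + c


def grouped_right(a, b, c):
    """Evaluate a + (b + c)."""
    return a + (b + c)


# Small case: the two groupings differ in the last bit.
small = (0.1, 0.2, 0.3)

# Large case: adding 1.0 to -1e16 is lost to rounding, so one grouping
# drops the 1.0 entirely while the other keeps it.
large = (1e16, -1e16, 1.0)

if __name__ == "__main__":
    for operands in (small, large):
        print(operands, grouped_left(*operands), grouped_right(*operands))
```

This is why a rewrite that reassociates sums, however valid over the reals, can change floating-point results.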

cc @kaushikcfd Prototype remap: https://gist.github.com/inducer/b293430efb27c50d0048cd278540f038