James Schloss
Another stupid question, is KA broken for AMDGPU right now? I cannot seem to launch kernels off of `ROCDevice()`s anymore... I can just work on an older version of KA...
Great! I'll stick with 0.8 for a bit then. Sorry to divert the thread!
I'm struggling to find much at all on warp-level semantics for Metal or even oneAPI. It seems like OpenCL just ignores it(?): https://stackoverflow.com/questions/42259118/is-there-any-guarantee-that-all-of-threads-in-wavefront-opencl-always-synchron To be honest, I haven't seen an...
I would guess the outlier here is Metal (and parallel CPU) then? I think AMD (wavefronts), CUDA (warps), and Intel (subgroups) all have some concept of warp-level operations; however, I...
It also looks like Vulkan is trying to standardize the terminology as well: https://www.khronos.org/blog/vulkan-subgroup-tutorial. Their API is supposed to be similar to OpenCL for compute, but I cannot find such...
Sorry for the delay in responding here. To be honest, it might take some work to get the old Blender script up and running. A lot of things have changed...
Are you still hoping to use Blender Python for this or would you prefer to use some other software? I could whip up an example in Julia in like an...
Well, in this case I don't know if Blender is the right software either. It's great for animations, but not (necessarily) for games. Usually people use Blender to create the...
I created a partially fixed version of the 4D script in #32, but don't recommend using it. I only really checked the `visualize_fourth_dimension` function, which is the one that does...
Ah, I guess while I'm here, I'll briefly explain the differences with CUDA syntactically: 1. Indexing is easier: `@index(Global / Group / Local, Linear / NTuple / CartesianIndex)` vs `(blockIdx().x...