Valentin Churavy

1413 comments by Valentin Churavy

Having this would be fantastic! Especially for scientific applications rendering on a server.

> Is it anyways the recommended guideline to set up kernels? It is a bit tricky between CPU and GPU. Right now the KA backend on the CPU is rather...

It would be interesting to use `CUDA.@profile` to see whether the kernel itself slowed down or the "auto-tuning" adds that overhead.

So the default `workgroupsize` for KA is 1024. With 64 you create a lot of small tasks. What is the typical `ndrange` you use?

Ah, so you are getting perfectly sized blocks by accident xD You may want to use `(64, 64)` instead as the workgroup size.
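
To make the trade-off between workgroup size and task count concrete, here is a minimal sketch in plain Julia. It assumes the number of workgroups is the per-dimension ceiling division of `ndrange` by the workgroup size. The helper `workgroup_count` and the problem sizes are hypothetical and only for illustration; they are not part of the KA API and are not taken from the thread.

```julia
# Hypothetical helper (not part of KernelAbstractions): counts the workgroups
# needed to cover `ndrange`, assuming ceiling division in each dimension.
function workgroup_count(ndrange::NTuple{N,Int}, workgroupsize::NTuple{N,Int}) where {N}
    return prod(cld.(ndrange, workgroupsize))
end

# Illustrative problem sizes, chosen only to show the arithmetic.
ndrange_1d = (1_048_576,)
ndrange_2d = (4096, 4096)

n_1d_small   = workgroup_count(ndrange_1d, (64,))     # many small groups
n_1d_default = workgroup_count(ndrange_1d, (1024,))   # the default size mentioned above
n_2d_square  = workgroup_count(ndrange_2d, (64, 64))  # square 2D groups

println("1D, workgroupsize 64:        ", n_1d_small, " workgroups")
println("1D, workgroupsize 1024:      ", n_1d_default, " workgroups")
println("2D, workgroupsize (64, 64):  ", n_2d_square, " workgroups")
```

If each workgroup maps to one task on the CPU, as the comment above suggests, a smaller workgroup size means proportionally more tasks to schedule for the same `ndrange`.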

Yeah I will need to improve this on the KA side

I just tagged a new KA version with the fix. This might remove the need for the SIMD variant entirely.

This is fantastic! I have long wanted to explore more fine-grained compilation caching using approaches like `llvm-cas`. From a cursory look this seems similar to how GPUCompiler's on-disk cache...

I am hesitant about the inclusion of AI features by default. For my own use-cases, I avoid AI often since I find it fairly distracting. For my classroom, I would...

Yeah a `frontmatter` switch would suffice for most of the cases I am thinking about.