Julian Samaroo comments

Results: 413 comments of Julian Samaroo

ROCKernels: Update to AMDGPU 0.4

@vchuravy the `

ROCKernels: Update to AMDGPU 0.4

Ok, now all calls generated by the `@kernel` macro forward down to `KernelAbstractions.construct`, which is defined for `::Device`, and ROCKernels defines its own for `::ROCDevice`.

ROCKernels: Update to AMDGPU 0.4

Yeah, the MPI failures seem to be something specific to CI. Let's hold off on merging this until: - We've tested that this works properly for CUDAKernels (since we change...

ROCKernels: Update to AMDGPU 0.4

https://github.com/JuliaGPU/AMDGPU.jl/pull/280

Launch kernels and dependencies

I want to recommend using AMDGPU's behavior, but it requires intrusive changes within the GPU array objects to support it, and likely adds some overhead during kernel launch (to search...

Docs for AMD support.

So CUDAKernels/ROCKernels are the packages you need to load to get CUDA/AMDGPU support with KA, respectively. They each export `CUDADevice`/`ROCDevice`, which can be passed as the first argument (instead...

Julia does not support cross-compilation to 32-bits

I suspect running with opaque pointers in Julia 1.10 should probably resolve this, as the pointer size is not hard-coded into the IR.

[RFC] Dagger Integration

> I suspect the main difficulty is the way that TimeDag.run_node! currently works by mutating the node state [1]. I assume (?) that this is an awkward thing to support...

where is the sample code for usage?

Hey @RuyiDu, not sure if you're still looking for an answer to this question (which I think is a valid one). I just got uGUI running on my STM32F746-Discovery...

Incorrect return value of tuple with `compile_shlib`

Looking at `@code_llvm`, returning a `Tuple` uses the `sret` calling convention, which means that the first argument to the function is a slot allocated on the stack that the result...
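To make the `sret` point concrete, here is a minimal C sketch of what such a signature looks like from the caller's side. This is an illustration only, not the code `compile_shlib` actually emits: the struct layout `triple_t` and the names `make_triple_sret` and `sum_via_sret` are hypothetical stand-ins, assuming a three-field tuple of 64-bit integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout standing in for a tuple of three 64-bit integers. */
typedef struct {
    int64_t a;
    int64_t b;
    int64_t c;
} triple_t;

/*
 * sret-style signature: the function returns void, and the caller passes
 * a pointer to the result slot as the first argument. This is a stand-in
 * for a compiled function, written by hand for illustration.
 */
void make_triple_sret(triple_t *out, int64_t x)
{
    assert(out != NULL);
    out->a = x;
    out->b = x * 2;
    out->c = x * 3;
}

/* Caller side: allocate the slot on the stack, then pass its address. */
int64_t sum_via_sret(int64_t x)
{
    triple_t result;
    make_triple_sret(&result, x);
    return result.a + result.b + result.c;
}
```

If the caller instead declares the function as returning the struct by value, the compiler may apply a different convention for that struct (for example, returning small structs in registers), so the two sides can disagree about where the result lives. Declaring the explicit out-pointer form, as above, avoids relying on that match.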