Julian Samaroo

172 issues by Julian Samaroo

To be thread-safe, we need at least one lock around all runtime operations which mutate global state (such as `DEFAULT_AGENT`/`DEFAULT_QUEUE`).
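A minimal sketch of the single-lock approach described above, written in Python purely for illustration (in Julia the same pattern would use a `ReentrantLock`). The helper names `set_default_agent` and `get_default_queue`, and the tuple standing in for a real queue, are hypothetical and not part of any existing API; only `DEFAULT_AGENT` and `DEFAULT_QUEUE` come from the issue text.

```python
import threading

# One re-entrant lock guarding every mutation of the runtime's global state.
# Re-entrant so that a locked function can safely call another locked function.
_RUNTIME_LOCK = threading.RLock()

# Stand-ins for the globals named in the issue.
DEFAULT_AGENT = None
DEFAULT_QUEUE = None


def set_default_agent(agent):
    """Replace the default agent and invalidate the queue tied to the old one."""
    global DEFAULT_AGENT, DEFAULT_QUEUE
    with _RUNTIME_LOCK:
        DEFAULT_AGENT = agent
        DEFAULT_QUEUE = None  # assumption: a queue belongs to exactly one agent


def get_default_queue():
    """Lazily create the default queue for the current default agent."""
    global DEFAULT_QUEUE
    with _RUNTIME_LOCK:
        if DEFAULT_QUEUE is None:
            # Placeholder for real queue creation (hypothetical).
            DEFAULT_QUEUE = ("queue", DEFAULT_AGENT)
        return DEFAULT_QUEUE
```

Because the check and the assignment in `get_default_queue` happen under the same lock, concurrent callers cannot each create their own queue. A single coarse lock is the simplest option; finer-grained locking could be considered later if contention turns out to matter.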

Labels: bug, hsa

It should be possible to support non-bitstype arguments and possibly on-device allocations with a bit of elbow grease, as long as we allocate all non-bitstype structures entirely on HSA finegrained...

The LLVM AMDGPU target has features like XNACK that we might want to enable in certain cases, like wavefront debugging. We should document each known feature and provide a way...

LLVM and the ROCm device libs expose the necessary functions to access the owning queue for a kernel and place packets on it. We should implement the equivalent of CUDAnative's...

https://github.com/ROCmSoftwarePlatform/rocSPARSE

https://github.com/ROCmSoftwarePlatform/rocALUTION

https://github.com/ROCmSoftwarePlatform/MIOpen: MIOpen is a pretty large library, so this one should be doable in pieces. Having even partial MIOpen support will make https://github.com/FluxML/Flux.jl/pull/938 more useful.

Read-only memory can be allocated via the HSA runtime, and can potentially be much faster for reads than regular global memory. We should support working with this memory via the...

For interactions with HIP, which uses an implicit, incrementing device ID similar to CUDA, we should provide functions that can map from HSA's agents to HIP's integer device IDs, and...