AMDGPU.jl
AMD GPU (ROCm) programming in Julia
@jpsamaroo I also improved the dispatch for complex `ROCArray`.
Update the doc section on queue handling to reflect the latest syntax.
Some libraries, like rocSPARSE, call HIP functions which expect to be passed allocations generated from `hipMalloc` and friends. Because `hipMalloc` just ends up calling HSA allocation functions, we should be...
Previously, a call to `killqueue()` would not clean up the global `QUEUES` list. On my device, a new call to `Queue()` would reuse the same queue pointer, and this would...
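The cleanup described in this report could be sketched roughly as follows. This is a hypothetical illustration, not the actual AMDGPU.jl internals: the names `kill_queue!`, `QUEUES_LOCK`, and the field access `queue.queue` are assumptions; only the global `QUEUES` list comes from the issue text.

```
# Hypothetical sketch: drop the queue from the global QUEUES list when it
# is killed, so a later Queue() that happens to reuse the same underlying
# queue pointer does not pick up stale state.
function kill_queue!(queue)
    lock(QUEUES_LOCK) do
        # assumed: QUEUES is keyed by the raw queue pointer
        delete!(QUEUES, queue.queue)
    end
    # ...then proceed with the HSA-level queue destruction...
end
```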
I've noticed a number of competing techniques used to manage data races throughout the codebase. There is the RT_LOCK (I assume RT = runtime) used to manage global state access...
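One way to consolidate the competing techniques mentioned above is to funnel all global-state mutation through a single lock helper, so the locking discipline lives in one place. A minimal sketch, assuming a `ReentrantLock`-based `RT_LOCK` as in the issue text (the helper name `with_runtime_lock` is hypothetical):

```
const RT_LOCK = ReentrantLock()

# Hypothetical helper: every mutation of global runtime state goes
# through this function, so there is exactly one locking convention
# to audit for data races.
function with_runtime_lock(f)
    lock(RT_LOCK) do
        f()
    end
end
```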
This test currently fails (Julia 1.8, MI250x) https://github.com/JuliaGPU/AMDGPU.jl/blob/e0a48dd9aadc0329e176a983ff0d7ee0e824b252/test/hsa/memory.jl#L184 with the following error:
```
Region API Queries: Test Failed at /pfs/lustrep4/scratch/project_465000139/lurass/AMDGPU.jl/test/hsa/memory.jl:184
  Expression: all(Runtime.region_host_accessible, regions_finegrained)
Stacktrace:
 [1] macro expansion
   @ /pfs/lustrep4/scratch/project_465000139/lurass/julia_local/julia-1.8.0/share/julia/stdlib/v1.8/Test/src/Test.jl:464 [inlined]
 [2]...
```
Hello, I just got a new AMD RX 6700 XT and have been testing out some AMDGPU features with it. When using it for the first time, I got the...
CUDA has a great feature for sizing threads and blocks, namely launch_configuration(). I rarely size my kernels manually; instead I do something like:
```
kernel = @cuda launch=false myfunc(args...)
config = launch_configuration(kernel.fun)...
```
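For context, the CUDA.jl pattern being requested above can be sketched in full. This is a sketch of the CUDA.jl occupancy API, not of AMDGPU.jl (which, per this request, lacked an equivalent at the time); the kernel `vadd!` and the array sizes are illustrative:

```
using CUDA

# Simple elementwise-add kernel used only to demonstrate the sizing API.
function vadd!(c, a, b)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    i <= length(c) && (c[i] = a[i] + b[i])
    return
end

a = CUDA.rand(1024); b = CUDA.rand(1024); c = similar(a)

# Compile without launching, ask the driver for an occupancy-based
# configuration, then derive threads/blocks from it.
kernel = @cuda launch=false vadd!(c, a, b)
config = launch_configuration(kernel.fun)
threads = min(length(c), config.threads)
blocks = cld(length(c), threads)
kernel(c, a, b; threads, blocks)
```

The appeal of this pattern is that the driver, not the user, picks a thread count that keeps occupancy high for the compiled kernel, so the same code adapts across devices.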