AMDGPU.jl
Support non-bitstype arguments and allocations
It should be possible to support non-bitstype arguments, and possibly on-device allocations, with a bit of elbow grease, as long as we allocate all non-bitstype structures entirely in HSA fine-grained memory blocks (even when they reference other device memory blocks). We'll probably need to provide:
- [ ] A way to allocate non-bitstype objects on the host using only HSA memory blocks (Cassette?)
- [ ] A way to convert device-side allocations from Julia's address spaces and conventions to their AMDGPU equivalents
- [ ] malloc/free on the device
- [ ] An optional verifier pass to ensure all non-bitstype arguments point only to valid (device-reachable) memory locations
- [ ] All sorts of tests to ensure this actually works
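For context on what the bits-type restriction actually excludes, here is a minimal host-side illustration (plain Julia, no GPU required; `Base.isbitsunion` is an internal API, so treat that line as an implementation detail rather than a stable interface):

```julia
# Bits types can be copied to device memory verbatim:
isbitstype(Float64)                  # true
isbitstype(Union{Float64, Int64})    # false: unions carry a runtime type tag
isbitstype(Vector{Float64})          # false: contains a heap reference

# Julia does store small unions of bits types inline in Arrays
# (value storage plus a tag byte), which is what makes device-side
# support for them plausible:
Base.isbitsunion(Union{Float64, Int64})  # true (internal API)
```

The checklist above is essentially about bridging the gap between these Julia-side memory layouts and what HSA fine-grained memory can represent.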
I guess this issue would also cover unions of bits types, as in this CUDA issue?
Right now when I run:
using AMDGPU
data = Union{Float64, Int64}[1.0f0, 2]
gpu_data = AMDGPU.ROCArray(data)
I get:
ERROR: AssertionError: ROCArray only supports bits types
Stacktrace:
[1] (ROCArray{Union{Float64, Int64}, 1})(buf::GPUArrays.DataRef{AMDGPU.Runtime.Mem.HIPBuffer}, dims::Tuple{Int64}; offset::Int64)
@ AMDGPU ~/.julia/packages/AMDGPU/YWwia/src/array.jl:19
[2] (ROCArray{Union{Float64, Int64}, 1})(buf::GPUArrays.DataRef{AMDGPU.Runtime.Mem.HIPBuffer}, dims::Tuple{Int64})
@ AMDGPU ~/.julia/packages/AMDGPU/YWwia/src/array.jl:16
[3] (ROCArray{Union{Float64, Int64}, 1})(::UndefInitializer, dims::Tuple{Int64})
@ AMDGPU ~/.julia/packages/AMDGPU/YWwia/src/array.jl:77
[4] (ROCArray{Union{Float64, Int64}, 1})(x::Vector{Union{Float64, Int64}}, dims::Tuple{Int64})
@ AMDGPU ~/.julia/packages/AMDGPU/YWwia/src/array.jl:100
[5] (ROCArray{Union{Float64, Int64}, 1})(x::Vector{Union{Float64, Int64}})
@ AMDGPU ~/.julia/packages/AMDGPU/YWwia/src/array.jl:123
[6] ROCArray(A::Vector{Union{Float64, Int64}})
@ AMDGPU ~/.julia/packages/AMDGPU/YWwia/src/array.jl:127
[7] top-level scope
@ REPL[77]:1
(The contrived example is just for simplicity. What I really wanted to do was use Flux.jl with Grassmann.jl to optimize geometric quantities. Currently Grassmann relies on unions of bits types for convenience.)
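Until unions of bits types are supported, one possible host-side workaround is to promote the elements to a single concrete bits type before uploading. This is only a sketch and it discards the `Int64`/`Float64` distinction, so it won't suit every use case (Grassmann's, in particular, may depend on that distinction):

```julia
using AMDGPU

data = Union{Float64, Int64}[1.0, 2]

# Promote every element to Float64 so the eltype is a concrete bits type.
concrete = Float64.(data)            # Vector{Float64}
gpu_data = AMDGPU.ROCArray(concrete) # satisfies the bits-type requirement
```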