AMDGPU.jl

2-`norm` for views of ROCArray falls back to scalar indexing

Open luraess opened this issue 11 months ago • 0 comments

This is related to https://github.com/JuliaGPU/CUDA.jl/issues/2280

When taking the 2-norm (or any p-norm other than the 1-norm and Inf-norm) of a view of a ROCArray, the call errors because the implementation falls back to scalar indexing. Interestingly, the 1-norm and Inf-norm of the same view do not error.

The Minimal Working Example (MWE) for this bug:

julia> using AMDGPU

julia> AMDGPU.allowscalar(false)

julia> F = AMDGPU.rand(Float64, 10, 10);

julia> using LinearAlgebra

julia> norm(F, 1)
48.031221724616245

julia> norm(F, Inf)
0.9953361044936007

julia> norm(F, 2)
5.651939527772477

julia> F_v = @view F[2:end-1,2:end-1];

julia> norm(F_v, 1)
31.142444509460844

julia> norm(F_v, Inf)
0.9953361044936007

julia> norm(F_v, 2)
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex
    @ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/indexing.jl:48 [inlined]
  [6] scalar_getindex(::ROCArray{Float64, 2, AMDGPU.Runtime.Mem.HIPBuffer}, ::Int64, ::Vararg{Int64})
    @ GPUArrays ~/.julia/packages/GPUArrays/Hd5Sk/src/host/indexing.jl:34
  [7] _getindex
    @ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/indexing.jl:17 [inlined]
  [8] getindex
    @ ~/.julia/packages/GPUArrays/Hd5Sk/src/host/indexing.jl:15 [inlined]
  [9] getindex
    @ ./subarray.jl:290 [inlined]
 [10] _getindex
    @ ./abstractarray.jl:1341 [inlined]
 [11] getindex
    @ ./abstractarray.jl:1291 [inlined]
 [12] iterate
    @ ./abstractarray.jl:1217 [inlined]
 [13] iterate
    @ ./abstractarray.jl:1215 [inlined]
 [14] generic_norm2(x::SubArray{Float64, 2, ROCArray{…}, Tuple{…}, false})
    @ LinearAlgebra ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:465
 [15] norm2
    @ ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:529 [inlined]
 [16] norm(itr::SubArray{Float64, 2, ROCArray{…}, Tuple{…}, false}, p::Int64)
    @ LinearAlgebra ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:598
 [17] top-level scope
    @ REPL[25]:1
 [18] top-level scope
    @ ~/.julia/packages/AMDGPU/gtxsf/src/tls.jl:200
Some type information was truncated. Use `show(err)` to see complete types.

The 2-norm of a ROCArray view should behave like the other norms and run on the GPU instead of falling back to scalar indexing.
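Until this is fixed, a possible workaround is to compute the 2-norm as a reduction rather than through `norm(F_v, 2)`. The stack trace shows `generic_norm2` iterating over the view element by element, which is what triggers scalar indexing. The sketch below assumes that `sum(abs2, x)` on a view of a ROCArray is executed as a GPU reduction; I have not verified this here, but the working 1-norm and Inf-norm calls suggest reductions over views are supported. `norm2_reduce` is a hypothetical helper name, not part of AMDGPU.jl or LinearAlgebra, and plain CPU arrays stand in for the ROCArray so the snippet runs without a GPU.

```julia
using LinearAlgebra

# Hypothetical helper (not part of AMDGPU.jl or LinearAlgebra):
# 2-norm expressed as a reduction instead of element-wise iteration.
norm2_reduce(x) = sqrt(sum(abs2, x))

# CPU stand-ins for the arrays in the MWE; on a GPU one would use
# F = AMDGPU.rand(Float64, 10, 10) instead (assumption: same call works there).
F = rand(Float64, 10, 10)
F_v = @view F[2:end-1, 2:end-1]

# Compare against the reference implementation on the CPU.
@show norm2_reduce(F_v) ≈ norm(F_v, 2)
```

Unlike `LinearAlgebra.norm`, this naive version does not rescale its input, so it may overflow or underflow for very large or very small values.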

Version info

julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 7 5800X 8-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)

and

julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬──────────────────────────────────────────────────────────────────────────────
│ Available │ Name             │ Version   │ Path                                                                        ⋯
├───────────┼──────────────────┼───────────┼──────────────────────────────────────────────────────────────────────────────
│     +     │ LLD              │ -         │ /opt/rocm/llvm/bin/ld.lld                                                   ⋯
│     +     │ Device Libraries │ -         │ /home/lraess/.julia/artifacts/5ad5ecb46e3c334821f54c1feecc6c152b7b6a45/amdg ⋯
│     +     │ HIP              │ 6.0.32830 │ /opt/rocm-6.0.0/lib/libamdhip64.so                                          ⋯
│     +     │ rocBLAS          │ 4.0.0     │ /opt/rocm-6.0.0/lib/librocblas.so                                           ⋯
│     +     │ rocSOLVER        │ 3.24.0    │ /opt/rocm-6.0.0/lib/librocsolver.so                                         ⋯
│     +     │ rocALUTION       │ -         │ /opt/rocm-6.0.0/lib/librocalution.so                                        ⋯
│     +     │ rocSPARSE        │ -         │ /opt/rocm-6.0.0/lib/librocsparse.so                                         ⋯
│     +     │ rocRAND          │ 2.10.5    │ /opt/rocm-6.0.0/lib/librocrand.so                                           ⋯
│     +     │ rocFFT           │ 1.0.21    │ /opt/rocm-6.0.0/lib/librocfft.so                                            ⋯
│     +     │ MIOpen           │ 3.0.0     │ /opt/rocm-6.0.0/lib/libMIOpen.so                                            ⋯
└───────────┴──────────────────┴───────────┴──────────────────────────────────────────────────────────────────────────────
                                                                                                          1 column omitted

[ Info: AMDGPU devices
┌────┬───────────────────────┬──────────┬───────────┬────────────┐
│ Id │                  Name │ GCN arch │ Wavefront │     Memory │
├────┼───────────────────────┼──────────┼───────────┼────────────┤
│  1 │ AMD Radeon RX 7800 XT │  gfx1101 │        32 │ 15.984 GiB │
└────┴───────────────────────┴──────────┴───────────┴────────────┘

luraess, Mar 04 '24 08:03