Metal.jl Int64 not supported on AMD GPUs?

I am running Julia 1.8.0-rc1 (2022-05-27) on macOS 12.4 with an AMD Radeon Pro 5700 XT GPU. Broadcasting over an `MtlArray{Int64}` fails with a Metal compiler error:

julia> a .+ 1
┌ Warning: Compilation of MetalLib to native code failed.
│ If you think this is a bug, please file an issue and attach /var/folders/3n/56fpv14n4wj0c1l1sb106pzw0000gn/T/jl_OUC1h1KIc6.metallib.
└ @ Metal ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:178
ERROR: MtlError: Compiler encountered an internal error (code 2, CompilerError)

Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/Metal/fQowO/lib/core/helpers.jl:68 [inlined]
  [2] MtlComputePipelineState(d::MtlDevice, f::MtlFunction)
    @ Metal.MTL ~/.julia/packages/Metal/fQowO/lib/core/compute_pipeline.jl:25
  [3] mtlfunction_link(job::GPUCompiler.CompilerJob, compiled::NamedTuple{(:image, :entry), Tuple{Vector{UInt8}, String}})
    @ Metal ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:172
  [4] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(Metal.mtlfunction_compile), linker::typeof(Metal.mtlfunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/iaKrd/src/cache.jl:95
  [5] mtlfunction(f::GPUArrays.var"#broadcast_kernel#15", tt::Type{Tuple{Metal.mtlKernelContext, MtlDeviceVector{Int64, 1}, Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Int64, 1}, Tuple{Bool}, Tuple{Int64}}, Int64}}, Int64}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Metal ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:143
  [6] mtlfunction
    @ ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:136 [inlined]
  [7] macro expansion
    @ ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:64 [inlined]
  [8] #launch_heuristic#53
    @ ~/.julia/packages/Metal/fQowO/src/gpuarrays.jl:14 [inlined]
  [9] _copyto!
    @ ~/.julia/packages/GPUArrays/EVTem/src/host/broadcast.jl:73 [inlined]
 [10] copyto!
    @ ~/.julia/packages/GPUArrays/EVTem/src/host/broadcast.jl:56 [inlined]
 [11] copy
    @ ~/.julia/packages/GPUArrays/EVTem/src/host/broadcast.jl:47 [inlined]
 [12] materialize(bc::Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(+), Tuple{MtlArray{Int64, 1}, Int64}})
    @ Base.Broadcast ./broadcast.jl:860
 [13] top-level scope
    @ REPL[6]:1
 [14] top-level scope
    @ ~/.julia/packages/Metal/fQowO/src/initialization.jl:25

Here are the details of the full session:

 % ./usr/bin/julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.0-rc1 (2022-05-27)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |

julia> import Pkg; Pkg.add("Metal")
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
   Installed GPUArrays ──────────── v8.4.0
   Installed Metal_LLVM_Tools_jll ─ v0.3.0+1
   Installed cmt_jll ────────────── v0.1.0+0
   Installed GPUArraysCore ──────── v0.1.0
   Installed CEnum ──────────────── v0.4.2
   Installed LLVMExtra_jll ──────── v0.0.16+0
   Installed GPUCompiler ────────── v0.16.1
   Installed Metal ──────────────── v0.1.0
   Installed LLVM ───────────────── v4.14.0
  Downloaded artifact: Metal_LLVM_Tools
  Downloaded artifact: LLVMExtra
  Downloaded artifact: cmt
    Updating `~/.julia/environments/v1.8/Project.toml`
  [dde4c033] + Metal v0.1.0
    Updating `~/.julia/environments/v1.8/Manifest.toml`
  [79e6a3ab] + Adapt v3.3.3
  [fa961155] + CEnum v0.4.2
  [e2ba6199] + ExprTools v0.1.8
  [0c68f7d7] + GPUArrays v8.4.0
  [46192b85] + GPUArraysCore v0.1.0
  [61eb1bfa] + GPUCompiler v0.16.1
  [692b3bcd] + JLLWrappers v1.4.1
  [929cbde3] + LLVM v4.14.0
  [dde4c033] + Metal v0.1.0
  [21216c6a] + Preferences v1.3.0
  [189a3867] + Reexport v1.2.2
  [a759f4b9] + TimerOutputs v0.5.20
  [dad2f222] + LLVMExtra_jll v0.0.16+0
  [0418c028] + Metal_LLVM_Tools_jll v0.3.0+1
  [65323cdd] + cmt_jll v0.1.0+0
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [b77e0a4c] + InteractiveUtils
  [4af54fe1] + LazyArtifacts
  [b27032c2] + LibCURL v0.6.3
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.8.0
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [fa267f1f] + TOML v1.0.0
  [a4e569a6] + Tar v1.10.0
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v0.5.2+0
  [deac9b47] + LibCURL_jll v7.81.0+0
  [29816b5a] + LibSSH2_jll v1.10.2+0
  [c8ffd9c3] + MbedTLS_jll v2.28.0+0
  [14a3606d] + MozillaCACerts_jll v2022.2.1
  [4536629a] + OpenBLAS_jll v0.3.20+0
  [83775a58] + Zlib_jll v1.2.12+3
  [8e850b90] + libblastrampoline_jll v5.1.0+0
  [8e850ede] + nghttp2_jll v1.41.0+1
  [3f19e933] + p7zip_jll v17.4.0+0
Precompiling project...
  21 dependencies successfully precompiled in 12 seconds

julia> Metal.versioninfo()
ERROR: UndefVarError: Metal not defined
Stacktrace:
 [1] top-level scope
   @ REPL[2]:1

julia> using Metal

julia> Metal.versioninfo()
macOS 12.4.0, Darwin 21.5.0

Toolchain:
- Julia: 1.8.0-rc1
- LLVM: 13.0.1

1 device:
- AMD Radeon Pro 5700 XT (0 bytes allocated)

julia> a = MtlArray([1])
1-element MtlArray{Int64, 1}:
 1

julia> a .+ 1
┌ Warning: Compilation of MetalLib to native code failed.
│ If you think this is a bug, please file an issue and attach /var/folders/3n/56fpv14n4wj0c1l1sb106pzw0000gn/T/jl_OUC1h1KIc6.metallib.
└ @ Metal ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:178
ERROR: MtlError: Compiler encountered an internal error (code 2, CompilerError)
Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/Metal/fQowO/lib/core/helpers.jl:68 [inlined]
  [2] MtlComputePipelineState(d::MtlDevice, f::MtlFunction)
    @ Metal.MTL ~/.julia/packages/Metal/fQowO/lib/core/compute_pipeline.jl:25
  [3] mtlfunction_link(job::GPUCompiler.CompilerJob, compiled::NamedTuple{(:image, :entry), Tuple{Vector{UInt8}, String}})
    @ Metal ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:172
  [4] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(Metal.mtlfunction_compile), linker::typeof(Metal.mtlfunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/iaKrd/src/cache.jl:95
  [5] mtlfunction(f::GPUArrays.var"#broadcast_kernel#15", tt::Type{Tuple{Metal.mtlKernelContext, MtlDeviceVector{Int64, 1}, Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Int64, 1}, Tuple{Bool}, Tuple{Int64}}, Int64}}, Int64}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Metal ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:143
  [6] mtlfunction
    @ ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:136 [inlined]
  [7] macro expansion
    @ ~/.julia/packages/Metal/fQowO/src/compiler/execution.jl:64 [inlined]
  [8] #launch_heuristic#53
    @ ~/.julia/packages/Metal/fQowO/src/gpuarrays.jl:14 [inlined]
  [9] _copyto!
    @ ~/.julia/packages/GPUArrays/EVTem/src/host/broadcast.jl:73 [inlined]
 [10] copyto!
    @ ~/.julia/packages/GPUArrays/EVTem/src/host/broadcast.jl:56 [inlined]
 [11] copy
    @ ~/.julia/packages/GPUArrays/EVTem/src/host/broadcast.jl:47 [inlined]
 [12] materialize(bc::Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(+), Tuple{MtlArray{Int64, 1}, Int64}})
    @ Base.Broadcast ./broadcast.jl:860
 [13] top-level scope
    @ REPL[6]:1
 [14] top-level scope
    @ ~/.julia/packages/Metal/fQowO/src/initialization.jl:25

julia> device(a)
MtlDevice:
 name:             AMD Radeon Pro 5700 XT
 lowpower:         false
 headless:         false
 removable:        false
 unified memory:   false
 registry id:      4294968934
 transfer rate:    0

julia> task_local_storage()[:MtlDevice] = MtlDevice(1)
MtlDevice:
 name:             AMD Radeon Pro 5700 XT
 lowpower:         false
 headless:         false
 removable:        false
 unified memory:   false
 registry id:      4294968934
 transfer rate:    0

julia>

Jul 01 '22 15:07 dbl001

Int64 and Float64 are not supported and lead to those crashes. Try converting the data to single precision.

Jul 01 '22 15:07 PhilipVinc

julia> b = convert(Array{Int32}, a)
1-element Vector{Int32}:
 1

julia> b .+ 1
1-element Vector{Int64}:
 2

Jul 01 '22 16:07 dbl001

> julia> b = convert(Array{Int32}, a)
> 1-element Vector{Int32}:
>  1
>
> julia> b .+ 1
> 1-element Vector{Int64}:
>  2

... and? Of course CPU vectors support Int64 addition; it's the GPU that's more limited here.

@PhilipVinc Int64 seems to work fine on my M1 though, and I can't imagine it not working, since integer literals in Julia are 64-bit by default.
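
To spell out why the quoted result is a `Vector{Int64}`: `convert(Array{Int32}, a)` produces an ordinary CPU array, and adding the literal `1` (an `Int` on the host) promotes the result back to the wider type. A minimal sketch in plain Julia, with no GPU involved; the plain vector `a` is only a stand-in for the `MtlArray` in the thread:

```julia
# Plain Julia, no Metal.jl needed. `a` stands in for the data from the thread.
a = [1]                          # Vector{Int}; Int is Int64 on 64-bit Julia
b = convert(Array{Int32}, a)     # Vector{Int32}, an ordinary CPU array

c = b .+ 1                       # the literal 1 is an Int, so the result is promoted
d = b .+ Int32(1)                # matching the element type keeps the result Int32
e = b .+ one(eltype(b))          # same idea without hard-coding the type

@show typeof(b) typeof(c) typeof(d) typeof(e)
```

So even with 32-bit data, a bare `1` in the broadcast brings a 64-bit integer back into the computation.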

Jul 04 '22 07:07 maleadt

julia> using Metal

julia> a = MtlArray([1])
1-element MtlArray{Int64, 1}:
 1

julia> b = convert(Array{Float32}, a)
1-element Vector{Float32}:
 1.0

julia> b .+ 1
1-element Vector{Float32}:
 2.0

Jul 04 '22 14:07 dbl001

Again, what are you trying to prove here? You're still broadcasting over a CPU vector. Please take the time to add some explanation, and not just dump code snippets; this is not helpful.
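
For a test that actually exercises the GPU, the array has to remain an `MtlArray` when the broadcast runs. A rough sketch, assuming `MtlArray(::Vector)` and `Array(::MtlArray)` behave as in the session above; whether it avoids the compiler error on the AMD device has not been verified here:

```julia
using Metal

# Choose the element type on the host, then upload; the array stays on the GPU.
a32 = MtlArray(Int32[1])

# Int32(1) avoids promoting to Int64, which a bare literal `1` would do.
b32 = a32 .+ Int32(1)

@show typeof(b32)    # should still be an MtlArray, not a Vector
@show Array(b32)     # copy the result back to the CPU to inspect it
```

If this runs while `MtlArray([1]) .+ 1` fails, that would point at 64-bit integers on this device rather than at broadcasting in general.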

Jul 05 '22 06:07 maleadt